PyPop.utils#
Module for common utility classes and functions.
Contains convenience classes for output of text and XML files.
Attributes#
GENOTYPE_SEPARATOR | Separator between genotypes
GENOTYPE_TERMINATOR | Terminator of genotypes
Classes#
TextOutputStream | Output stream for writing text files.
XMLOutputStream | Output stream for writing XML files.
StringMatrix | Matrix of strings and other metadata from input file to PyPop.
Group | Group list or sequence into non-overlapping chunks.
OrderedDict | A dictionary class with ordered pairs.
Index | Returns an Index object for OrderedDict.
Functions#
critical_exit | Log a CRITICAL message and exit with status 1.
getStreamType | Get the type of stream.
glob_with_pathlib | Use globbing with pathlib.
natural_sort_key | Generate a key for natural (human-friendly) sorting.
unique_elements | Gets the unique elements in a list.
appendTo2dList | Append a string to each element in a list.
convertLineEndings | Convert line endings based on platform.
fixForPlatform | Fix for some Windows/MS-DOS platforms.
copyfileCustomPlatform | Copy file to file with fixes.
copyCustomPlatform | Copy file to directory with fixes.
checkXSLFile | Check XSL filename and return full path.
getUserFilenameInput | Get user filename input.
splitIntoNGroups | Divides a list up into n parcels (plus whatever is left over).
Module Contents#
- GENOTYPE_SEPARATOR = '~'#
Separator between genotypes
Example
In a haplotype
01:01~13:01~04:02
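As a small illustration of how the separator is used, the sketch below splits the example haplotype into its per-locus components with plain string methods. The constant is redefined locally so the snippet runs without PyPop installed; the variable names are illustrative only.

```python
# Value copied from the attribute documented above.
GENOTYPE_SEPARATOR = "~"

haplotype = "01:01~13:01~04:02"

# Splitting on the separator yields one component per locus, in order.
alleles = haplotype.split(GENOTYPE_SEPARATOR)
print(alleles)  # ['01:01', '13:01', '04:02']

# Joining with the same separator restores the original string.
rebuilt = GENOTYPE_SEPARATOR.join(alleles)
```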
- GENOTYPE_TERMINATOR = '~'#
Terminator of genotypes
Example
02:01:01:01~
- class TextOutputStream(file)#
Output stream for writing text files.
- Parameters:
file (file) – file handle
- close()#
Close stream.
- flush()#
Flush to disk.
- class XMLOutputStream(file)#
Bases:
TextOutputStream
Output stream for writing XML files.
- opentag(tagname, **kw)#
Write an open XML tag to stream.
Tag attributes passed as optional named keyword arguments.
Example
opentag('tagname', role='something', id='else') produces the result:
<tagname role="something" id="else">
Attributes and values are optional:
opentag('tagname') produces:
<tagname>
See also
Must be followed by a closetag().
- Parameters:
tagname (str) – name of XML tag
- emptytag(tagname, **kw)#
Write an empty XML tag to stream.
This follows the same syntax as opentag(), but the tag has no XML content (it can still carry attributes).
Example
emptytag('tagname', attr='val') produces:
<tagname attr="val"/>
- Parameters:
tagname (str) – name of XML tag
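To make the tag-writing behaviour concrete, here is a minimal stand-in that reproduces the documented output of opentag() and emptytag() on an in-memory buffer. It is a sketch, not PyPop's implementation: the class name `SketchXMLStream`, the `closetag(tagname)` signature, and the use of `quoteattr` for attribute escaping are all assumptions.

```python
import io
from xml.sax.saxutils import quoteattr


class SketchXMLStream:
    """Hypothetical stand-in for XMLOutputStream; not PyPop's code."""

    def __init__(self, file):
        self.f = file

    def _attrs(self, kw):
        # quoteattr() escapes the value and wraps it in quotes.
        return "".join(f" {name}={quoteattr(str(value))}" for name, value in kw.items())

    def opentag(self, tagname, **kw):
        self.f.write(f"<{tagname}{self._attrs(kw)}>")

    def closetag(self, tagname):
        # Assumed signature: the source only names closetag().
        self.f.write(f"</{tagname}>")

    def emptytag(self, tagname, **kw):
        self.f.write(f"<{tagname}{self._attrs(kw)}/>")


buf = io.StringIO()
xml = SketchXMLStream(buf)
xml.opentag("tagname", role="something", id="else")
xml.emptytag("child", attr="val")
xml.closetag("tagname")
print(buf.getvalue())
# <tagname role="something" id="else"><child attr="val"/></tagname>
```

Attribute values are passed as strings here; keyword arguments keep their call order, so the attributes appear in the order given.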
- class StringMatrix(rowCount=None, colList=None, extraList=None, colSep='\t', headerLines=None)#
Bases:
numpy.lib.user_array.container
Matrix of strings and other metadata from input file to PyPop.
StringMatrix is a subclass of NumPy’s numpy.lib.user_array class and stores the data in an efficient array format, using NumPy-style access.
- Parameters:
- dump(locus=None, stream=sys.stdout)#
Write file to a stream in original format.
- Parameters:
locus (str, optional) – write just specified locus, if omitted, default to all loci
stream (TextOutputStream|XMLOutputStream|stdout) – output stream
- copy()#
Make a (deep) copy.
- Returns:
a deep copy of the current object
- Return type:
- getNewStringMatrix(key)#
Create new StringMatrix containing specified loci.
Note
The format of the keys is identical to __getitem__() except that it returns a full StringMatrix instance which includes all metadata.
- getUniqueAlleles(key)#
Get naturally sorted list of unique alleles.
- convertToInts()#
Convert the matrix to integers.
Note
This function is used by the PyPop.Haplo.Haplostats class. Note that integers start at 1 for compatibility with the haplo-stats module.
- Returns:
matrix where the original allele names are now represented by integers
- Return type:
- countPairs()#
Count all possible pairs of haplotypes for each matrix row.
Warning
This does not do any involved handling of missing data as per geno.count.pairs from the R haplo.stats module.
- Returns:
each element is the number of pairs in row order
- Return type:
- flattenCols()#
Flatten columns into a single list.
Important
Currently assumes entries are integers.
- Returns:
all alleles, the two genotype columns concatenated for each locus
- Return type:
- filterOut(key, blankDesignator)#
Get matrix rows filtered by a designator.
- getSuperType(key)#
Get a matrix grouped by specified key.
Example
Return a new matrix with the column vector with the alleles for each genotype concatenated like so:
>>> matrix = StringMatrix(2, ["A", "B"])
>>> matrix[0, "A"] = ("A01", "A02")
>>> matrix[1, "A"] = ("A11", "A12")
>>> matrix[0, "B"] = ("B01", "B02")
>>> matrix[1, "B"] = ("B11", "B12")
>>> print(matrix)
StringMatrix([['A01', 'A02', 'B01', 'B02'],
              ['A11', 'A12', 'B11', 'B12']], dtype=object)
>>> matrix.getSuperType("A:B")
StringMatrix([['A01:B01', 'A02:B02'],
              ['A11:B11', 'A12:B12']], dtype=object)
- Parameters:
key (str) – loci to group
- Returns:
a new matrix with the columns concatenated
- Return type:
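The column concatenation that getSuperType() documents can be mimicked on plain Python data, which may help when reasoning about the result without NumPy or PyPop at hand. The helper `super_type` and the dict-per-row layout are hypothetical; only the input values and the expected output come from the example above.

```python
def super_type(rows, loci, sep=":"):
    """Hypothetical helper: each row maps a locus to an (allele1, allele2) pair."""
    result = []
    for row in rows:
        # Concatenate the first alleles of every locus, then the second alleles.
        first = sep.join(row[locus][0] for locus in loci)
        second = sep.join(row[locus][1] for locus in loci)
        result.append([first, second])
    return result


rows = [
    {"A": ("A01", "A02"), "B": ("B01", "B02")},
    {"A": ("A11", "A12"), "B": ("B11", "B12")},
]
grouped = super_type(rows, "A:B".split(":"))
print(grouped)  # [['A01:B01', 'A02:B02'], ['A11:B11', 'A12:B12']]
```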
- class Group(li, size)#
Group list or sequence into non-overlapping chunks.
Example
>>> for pair in Group('aabbccddee', 2):
...     print(pair)
...
aa
bb
cc
dd
ee
>>> a = Group('aabbccddee', 2)
>>> a[0]
'aa'
>>> a[3]
'dd'
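For readers who want to see how such chunked access can work, the following sketch implements the same indexing and iteration behaviour with slicing. The class name `ChunkView` is hypothetical, and returning a shorter final chunk when the length is not a multiple of `size` is an assumption the source does not confirm.

```python
class ChunkView:
    """Hypothetical equivalent of Group; not PyPop's implementation."""

    def __init__(self, seq, size):
        if size < 1:
            raise ValueError("size must be at least 1")
        self.seq = seq
        self.size = size

    def __len__(self):
        # Ceiling division: a trailing partial chunk counts as one chunk.
        return -(-len(self.seq) // self.size)

    def __getitem__(self, i):
        # Raising IndexError past the end also lets a for loop terminate.
        if not 0 <= i < len(self):
            raise IndexError(i)
        start = i * self.size
        return self.seq[start:start + self.size]


chunks = ChunkView("aabbccddee", 2)
print(list(chunks))  # ['aa', 'bb', 'cc', 'dd', 'ee']
print(chunks[3])     # dd
```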
- class OrderedDict(hash=None)#
A dictionary class with ordered pairs.
Deprecated since version 1.3.1: Will be removed in a later release, to be replaced by internal Python version
Creates an ordered dict.
- index(key)#
Returns position of key in dict.
- keys()#
Returns list of keys in dict.
- values()#
Returns list of values in dict.
- items()#
Returns list of tuples of keys and values.
- insert(i, key, value)#
Inserts a key-value pair at a given index.
- remove(i)#
Removes a key-value pair from the dict.
- reverse()#
Reverses the order of the key-value pairs.
- sort(cmp=0)#
Sorts the dict (allows for sort algorithm).
- clear()#
Clears all the entries in the dict.
- copy()#
Makes a copy of the dict, also of the OrderedDict class.
- get(key)#
Returns the value of a key.
- has_key(key)#
Looks for existence of key in dict.
- update(dict)#
Updates entries in a dict based on another.
- count(key)#
Finds occurrences of a key in a dict (0/1).
- class Index(i=0)#
Returns an Index object for OrderedDict.
Deprecated since version 1.3.1: Will be removed in a later release, to be replaced by internal Python version
- critical_exit(message, *args)#
Log a CRITICAL message and exit with status 1.
Added in version 1.4.0.
- Parameters:
message (str) – Logging format string.
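The documented behaviour (log at CRITICAL level, then exit with status 1) can be approximated with the standard library alone. The function name `critical_exit_sketch` and the logger name are assumptions; how PyPop configures its logger is not stated in the source.

```python
import logging
import sys

logger = logging.getLogger("sketch")  # hypothetical logger name


def critical_exit_sketch(message, *args):
    """Log a CRITICAL message with logging-style format args, then exit."""
    logger.critical(message, *args)
    sys.exit(1)


# sys.exit() raises SystemExit, so the status can be inspected in a demo.
try:
    critical_exit_sketch("cannot open %s", "data.pop")
except SystemExit as exc:
    exit_code = exc.code

print(exit_code)  # 1
```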
- getStreamType(stream)#
Get the type of stream.
- Parameters:
stream (TextOutputStream|XMLOutputStream) – stream to check
- Returns:
either xml or text.
- Return type:
string
- glob_with_pathlib(pattern)#
Use globbing with pathlib.
- natural_sort_key(s, _nsre=re.compile('([0-9]+)'))#
Generate a key for natural (human-friendly) sorting.
This function splits a string into text and number components so that numbers are compared by value instead of lexicographically. It is intended for use as the key function in list.sort() or sorted().
Example
>>> items = ["item2", "item10", "item1"]
>>> sorted(items, key=natural_sort_key)
['item1', 'item2', 'item10']
- Parameters:
s (str) – The string to split into text and number components.
_nsre (Pattern) – Precompiled regular expression used internally to split the string into digit and non-digit chunks. This is not intended to be overridden in normal use.
- Returns:
A list of strings and integers to be used as a sort key.
- Return type:
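A key function of this kind can be written in a few lines around the regular expression shown in the signature. The sketch below is one plausible reading, not the PyPop source: the name `natural_key` is hypothetical, and whether the real function also normalises case is not stated.

```python
import re

_DIGITS = re.compile("([0-9]+)")


def natural_key(s, _nsre=_DIGITS):
    """Split s into alternating text and integer chunks for sorting."""
    # With a capturing group, re.split() alternates: text, digits, text, ...
    # so every odd index holds a run of digits.
    parts = _nsre.split(s)
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]


items = ["item2", "item10", "item1"]
ordered = sorted(items, key=natural_key)
print(ordered)               # ['item1', 'item2', 'item10']
print(natural_key("item10"))  # ['item', 10, '']
```

Because text and integers always occupy the same positions in every key, comparing two keys never mixes a string with an integer.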
- unique_elements(li)#
Gets the unique elements in a list.
- appendTo2dList(aList, appendStr=':')#
Append a string to each element in a list.
- convertLineEndings(file, mode)#
Convert line endings based on platform.
- fixForPlatform(filename, txt_ext=0)#
Fix for some Windows/MS-DOS platforms.
- copyfileCustomPlatform(src, dest, txt_ext=0)#
Copy file to file with fixes.
- copyCustomPlatform(file, dist_dir, txt_ext=0)#
Copy file to directory with fixes.
- checkXSLFile(xslFilename, path='', subdir='', abort=False, msg='')#
Check XSL filename and return full path.
- Parameters:
- Returns:
checked and validated path
- Return type:
- getUserFilenameInput(prompt, filename)#
Get user filename input.
Read user input for a filename, check its existence, continue requesting input until a valid filename is entered.
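The prompt-until-valid loop described above might look like the sketch below. Several details are assumptions: the name `ask_for_existing_file`, the injectable `read` callable (added so the loop can be exercised without a terminal), and the treatment of the second argument as a default used when the user enters nothing.

```python
import os
import tempfile


def ask_for_existing_file(prompt, default, read=input):
    """Keep asking until the answer names an existing file."""
    while True:
        # Assumption: an empty answer falls back to the default filename.
        answer = read(f"{prompt} [{default}]: ").strip() or default
        if os.path.isfile(answer):
            return answer
        print(f"File '{answer}' does not exist, please try again.")


# Demo with scripted answers instead of real keyboard input.
with tempfile.NamedTemporaryFile(suffix=".pop", delete=False) as tmp:
    existing = tmp.name

answers = iter(["no-such-file.pop", existing])
chosen = ask_for_existing_file("Data file", "data.pop", read=lambda _: next(answers))
os.remove(existing)
```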
- splitIntoNGroups(alist, n=1)#
Divides a list up into n parcels (plus whatever is left over).
Example
>>> a = ['A', 'B', 'C', 'D', 'E']
>>> splitIntoNGroups(a, 2)
[['A', 'B'], ['C', 'D'], ['E']]
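One way to read the description is "n equal parcels, with any remainder appended as a final, shorter group". The sketch below follows that reading and reproduces the documented example. Note that the example is equally consistent with "chunks of size n", so this is an interpretation, not PyPop's code; the function name and the handling of `n` larger than the list are assumptions.

```python
def split_into_n_groups_sketch(items, n=1):
    """Split items into n equal parcels plus one leftover group, if any."""
    if n < 1:
        raise ValueError("n must be at least 1")
    size = len(items) // n
    if size == 0:
        # Fewer items than parcels: treat everything as the leftover.
        return [list(items)] if items else []
    groups = [list(items[i * size:(i + 1) * size]) for i in range(n)]
    leftover = list(items[n * size:])
    if leftover:
        groups.append(leftover)
    return groups


parcels = split_into_n_groups_sketch(["A", "B", "C", "D", "E"], 2)
print(parcels)  # [['A', 'B'], ['C', 'D'], ['E']]
```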