Finalreport package

The module that parses the FinalReport.txt file.

class snplib.finalreport.FinalReport(allele: str | list | None = None, usecols: list[str] | None = None, dtype: dict | None = None, sep: str = '\t')

Bases: object

File that contains SNP information. File processing is triggered by the handle method. If values in ‘SID’ or ‘UNIQ_KEY’ are missing from the xlsx conversion file, the processed data will contain NaN values.

Parameters:
  • allele – A variant form of a single nucleotide polymorphism (SNP), a specific polymorphic site or a whole gene detectable at a locus. Accepted values: ‘AB’, ‘Forward’, ‘Top’, ‘Plus’, ‘Design’.

  • sep – Delimiter to use. Default value: ‘\t’ (tab).

  • usecols – Selection of fields for reading. Accelerates processing and reduces memory.

  • dtype – Data type(s) to apply to either the whole dataset or individual columns. E.g., {‘a’: np.float64, ‘b’: np.int32, ‘c’: ‘Int64’}.

Example

[Header]
GSGT Version    2.0.4
Processing Date 10/14/2021 4:02 PM
Content         BovineSNP50_v3_A1.bpm
Num SNPs        53218
Total SNPs      53218
Num Samples     3
Total Samples   3
[Data]
SNP Name  Sample ID  Allele1 - AB  Allele2 - AB  GC Score  GT Score
ABCA12    1          A             A             0.4048    0.8164
APAF1     1          B             B             0.9067    0.9155
…

__PATTERN_DATA = re.compile('(^\\[Data])')
__PATTERN_HEADER = re.compile('(^\\[Header])')
__allele
__convert_s_id(path_file: Path) → None

Converts the sample IDs found in FinalReport to animal registration numbers.

Parameters:

path_file – xlsx file that maps sample IDs to animal registration numbers.

__dtype
__handler_data(file_rep: Path) → None

Processes data and forms an array for further processing.

Parameters:

file_rep – Path to the file to be read.

__handler_header(file_rep: Path) → None

Reads the file and extracts the meta-information.

Parameters:

file_rep – Path to the file to be read.
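
As a rough illustration of how the `[Header]` and `[Data]` markers can drive the two handlers, the sketch below splits a report into a meta-information dictionary and a list of data rows. It is not the library's implementation: `split_report` is a hypothetical helper, and the assumption that header lines are `key<sep>value` pairs comes only from the example file shown above.

```python
import re

# Same regular expressions as the class-level patterns listed above.
PATTERN_HEADER = re.compile(r'(^\[Header])')
PATTERN_DATA = re.compile(r'(^\[Data])')


def split_report(lines, sep="\t"):
    """Hypothetical helper: return (header dict, list of data rows)."""
    header, rows = {}, []
    section = None
    for raw in lines:
        line = raw.rstrip("\n")
        if PATTERN_HEADER.match(line):
            section = "header"            # following lines are meta-information
        elif PATTERN_DATA.match(line):
            section = "data"              # following lines are the SNP table
        elif not line.strip():
            continue                      # skip blank lines
        elif section == "header":
            # Assumption: each header line is "key<sep>value".
            key, _, value = line.partition(sep)
            header[key.strip()] = value.strip()
        elif section == "data":
            rows.append(line.split(sep))
    return header, rows


sample = [
    "[Header]\n",
    "GSGT Version\t2.0.4\n",
    "Num SNPs\t53218\n",
    "[Data]\n",
    "SNP Name\tSample ID\tAllele1 - AB\tAllele2 - AB\n",
    "ABCA12\t1\tA\tA\n",
]
header, rows = split_report(sample)
```

Here the first element of `rows` is the line of field names, which is the input that a column-selection step would work on.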

__header
__processing_columns(lst_col: list[str]) → list[str] | None

Processes the line that contains all field names and selects a subset of them.

Parameters:

lst_col – List of all fields.

Returns:

Returns a list with the names of the selected fields.

__sample_by_allele(names: list[str]) → list[str] | None

Generates a list of field names, choosing which alleles to keep.

Parameters:

names – List of field names in the report file.

Returns:

Returns the list of fields filtered by allele.
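
One plausible reading of this filtering step is sketched below. The function `sample_by_allele` is a hypothetical stand-in, not the private method itself; it assumes that allele columns are named `Allele1 - <type>` and `Allele2 - <type>`, as in the example file, and that every non-allele column is kept.

```python
def sample_by_allele(names, allele=None):
    """Hypothetical stand-in: keep non-allele fields plus the requested allele fields."""
    if allele is None:
        return list(names)                # no filtering requested
    wanted = [allele] if isinstance(allele, str) else list(allele)
    selected = []
    for name in names:
        if not name.startswith("Allele"):
            selected.append(name)         # e.g. "SNP Name", "GC Score"
        elif name.split(" - ")[-1] in wanted:
            selected.append(name)         # e.g. "Allele1 - AB"
    return selected


names = [
    "SNP Name", "Sample ID",
    "Allele1 - Forward", "Allele2 - Forward",
    "Allele1 - AB", "Allele2 - AB",
    "GC Score",
]
ab_only = sample_by_allele(names, "AB")
```

With `allele="AB"`, the `Forward` columns are dropped while the identifier and score columns remain.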

__snp_data: DataFrame | None
__usecols
static _check_on_ru_symbols(seq: Series) → bool | None

Checks the sequence for Cyrillic characters.

Parameters:

seq – Sequence to verify.

Returns:

True if there are no Cyrillic characters, False if there are.
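
A check of this kind can be sketched with a regular expression over the Unicode Cyrillic block. The version below is an assumption about the approach, not the library's code: `has_no_cyrillic` is a hypothetical name, and it accepts any iterable of values rather than a pandas Series so that it runs with the standard library alone.

```python
import re

# Unicode "Cyrillic" block; an assumption about which characters count.
CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def has_no_cyrillic(values):
    """Hypothetical helper: True if no value contains a Cyrillic character."""
    return not any(CYRILLIC.search(str(value)) for value in values)


clean = has_no_cyrillic(["A123", "B-45"])
dirty = has_no_cyrillic(["A123", "Б45"])   # second value starts with Cyrillic "Б"
```

Such a check is useful for identifiers typed by hand, where a Cyrillic letter can look identical to a Latin one and silently break a join on the ID column.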

_delimiter
_map_rn
handle(file_rep: Path | str, conv_file: Path | str = None) → bool

Processes the FinalReport.txt file. Extracts the meta-information and the data.

Parameters:
  • file_rep – Path to the FinalReport.txt file; the file may have another name.

  • conv_file – The file that maps sample IDs to animal registration numbers.

Returns:

Returns True if file processing was successful, False if there were errors.

property header: dict
property snp_data: DataFrame | None