compact.reciprocal_hits

module to find protein pairs that are reciprocal “top” hits, with a choice of criteria for the “top” hit cut-off

main functions:
  • get_reciprocal_top_hits: determine reciprocal top hits from an rbo score matrix between two datasets
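
To make the idea of a reciprocal hit concrete, here is a minimal sketch of the ‘best’ case written directly in pandas. It is not the module's implementation: the helper name reciprocal_best_hits and the toy score values are made up for illustration, and it assumes that a higher score means a better hit and that rows and columns hold the ids of the two datasets.

```python
import pandas as pd


def reciprocal_best_hits(scores: pd.DataFrame) -> pd.Series:
    """Illustrative helper (hypothetical, not part of compact).

    Keeps a (row_id, col_id) pair only if col_id is the best-scoring
    column for row_id AND row_id is the best-scoring row for col_id.
    """
    best_col_per_row = scores.idxmax(axis=1)  # best partner of each row id
    best_row_per_col = scores.idxmax(axis=0)  # best partner of each column id
    pairs = {
        (row_id, col_id): scores.loc[row_id, col_id]
        for row_id, col_id in best_col_per_row.items()
        if best_row_per_col[col_id] == row_id
    }
    return pd.Series(pairs, dtype=float)


# Toy 'between' score matrix: rows are ids from one dataset,
# columns are ids from the other (values are invented).
scores = pd.DataFrame(
    [[0.9, 0.2, 0.1],
     [0.3, 0.8, 0.4],
     [0.7, 0.1, 0.2]],
    index=["a1", "a2", "a3"],
    columns=["b1", "b2", "b3"],
)

hits = reciprocal_best_hits(scores)
print(hits)
```

In this toy matrix a3 scores highest against b1, but b1 scores highest against a1, so a3 has no reciprocal partner; only (a1, b1) and (a2, b2) are kept.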

compact.reciprocal_hits.get_reciprocal_top_hits(scores, score_type='between', criterium='percent', percent=1, out_type='series')

determine reciprocal top hits for a given score matrix

Args:
scores (pd df): matrix with interaction scores

in case of ‘within’ scores: symmetric matrix with same ids in index and columns.

score_type (str, optional): Defaults to ‘between’.

type of interaction scores, either ‘within’ or ‘between’:
‘within’ for interaction scores within a single sample;
‘between’ for interaction scores between two samples.

criterium (‘best’ or ‘percent’): type of top hit criterion. Defaults to ‘percent’.

if ‘best’: returns only the single best hit.
if ‘percent’: takes the top n % of highest scoring hits.

percent (numeric, optional): top percentage. Defaults to 1.

if criterium is ‘percent’, returns the top n percent of hits

out_type (str, optional): Defaults to ‘series’.

if ‘dict’, output is a dict. if anything else, output is a pd series

Raises:

ValueError: when score_type parameter is invalid

Returns:
pd series: reciprocal top hits in pd series format: 2-level multiindex with id pair, values are scores

OR

top_hits_dict (dict): dictionary with reciprocal top hits. structure: {(‘l_id’, ‘r_id’): score}
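
The two return layouts carry the same information, and the sketch below shows how they relate in plain pandas. The ids and scores are invented, and naming the index levels l_id and r_id is an assumption borrowed from the dict structure above; the function itself may leave the levels unnamed.

```python
import pandas as pd

# Hypothetical result in the dict layout {(l_id, r_id): score}.
top_hits_dict = {("l1", "r1"): 0.92, ("l2", "r3"): 0.85}

# Equivalent series layout: 2-level MultiIndex with the id pair,
# values are the scores. Level names are an assumption.
index = pd.MultiIndex.from_tuples(list(top_hits_dict.keys()), names=["l_id", "r_id"])
top_hits_series = pd.Series(list(top_hits_dict.values()), index=index)

# Look up a single pair, or all partners of one left-hand id.
pair_score = top_hits_series.loc[("l1", "r1")]
partners_of_l2 = top_hits_series.loc["l2"]

# Converting back gives the dict layout again.
round_trip = top_hits_series.to_dict()
print(top_hits_series)
```

The series form is convenient for selecting by one id or joining with other tables, while the dict form is simpler for direct pair lookups.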