Conversation¶
- 
class 
convokit.model.conversation.Conversation(owner, id: Optional[str] = None, utterances: Optional[List[str]] = None, meta: Optional[Dict] = None)¶ Represents a discrete subset of utterances in the dataset, connected by a reply-to chain.
- Parameters
 owner – The Corpus that this Conversation belongs to
id – The unique ID of this Conversation
utterances – A list of the IDs of the Utterances in this Conversation
meta – Table of initial values for conversation-level metadata
- Variables
 id – the ID of the Conversation
meta – A dictionary-like view object providing read-write access to conversation-level metadata.
- 
add_meta(key: str, value) → None¶ Adds a key-value pair to the metadata of the corpus object
- Parameters
 key – name of metadata attribute
value – value of metadata attribute
- Returns
 None
- 
add_vector(vector_name: str)¶ Logs in the Corpus component object’s internal vectors list that the component object has a vector row associated with it in the vector matrix named vector_name.
Transformers that add vectors to the Corpus should use this to update the relevant component objects during the transform() step.
- Parameters
 vector_name – name of vector matrix
- Returns
 None
- 
check_integrity(verbose: bool = True) → bool¶ Check the integrity of this Conversation; i.e. do the constituent utterances form a complete reply-to chain?
- Parameters
 verbose – whether to print errors indicating the problems with the Conversation
- Returns
 True if the conversation structure is complete else False
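As a rough illustration of what a "complete reply-to chain" can mean, the sketch below checks a plain mapping from utterance IDs to reply-to IDs. It is a standalone toy and not ConvoKit's implementation: the function name `is_complete_reply_chain`, the dict layout, and the exact conditions checked (exactly one root, every reply target present, no cycles) are assumptions made for this example.

```python
from typing import Dict, Optional


def is_complete_reply_chain(reply_to: Dict[str, Optional[str]]) -> bool:
    """Return True if the utterances form a single tree under one root.

    `reply_to` maps each utterance ID to the ID it replies to, or None
    for the conversation root (hypothetical layout, for illustration).
    """
    roots = [utt_id for utt_id, parent in reply_to.items() if parent is None]
    if len(roots) != 1:
        return False  # zero roots or several roots

    # Every reply target must itself be part of the conversation.
    if any(p is not None and p not in reply_to for p in reply_to.values()):
        return False

    # Walking up from any utterance must end at the root, never loop.
    for utt_id in reply_to:
        seen = set()
        current = utt_id
        while reply_to[current] is not None:
            if current in seen:
                return False  # cycle in the reply-to links
            seen.add(current)
            current = reply_to[current]
    return True
```

Under these assumptions, `{"u1": None, "u2": "u1"}` passes, while a mapping whose reply target is missing from the conversation fails.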
- 
delete_vector(vector_name: str)¶ Delete a vector associated with this Corpus component object.
- Parameters
 vector_name – name of vector
- Returns
 None
- 
get_chronological_speaker_list(selector: Callable[[convokit.model.speaker.Speaker], bool] = <function Conversation.<lambda>>)¶ Get the speakers in the conversation sorted in chronological order (speakers may appear more than once)
- Parameters
 selector – (lambda) function that decides which speakers should be included; all speakers are included by default
- Returns
 list of speakers for each chronological utterance
- 
get_chronological_utterance_list(selector: Callable[[convokit.model.utterance.Utterance], bool] = <function Conversation.<lambda>>)¶ Get the utterances in the conversation sorted in increasing order of timestamp
- Parameters
 selector – function that decides which utterances should be included; all utterances are included by default
- Returns
 list of utterances, sorted by timestamp
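The selector-then-sort behaviour described above can be sketched without ConvoKit. The `Utt` dataclass and the `chronological` helper below are hypothetical stand-ins (not the library's `Utterance` class or its method); they only show how a boolean selector combines with ordering by timestamp.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utt:
    """Hypothetical stand-in for an utterance; not convokit's Utterance."""
    id: str
    speaker: str
    timestamp: int


def chronological(
    utts: List[Utt],
    selector: Callable[[Utt], bool] = lambda utt: True,
) -> List[Utt]:
    """Keep utterances accepted by `selector`, sorted by increasing timestamp."""
    return sorted((u for u in utts if selector(u)), key=lambda u: u.timestamp)


utts = [Utt("c", "bob", 30), Utt("a", "alice", 10), Utt("b", "bob", 20)]
all_ids = [u.id for u in chronological(utts)]
bob_ids = [u.id for u in chronological(utts, lambda u: u.speaker == "bob")]
```

Here `all_ids` comes out as `["a", "b", "c"]` and `bob_ids` as `["b", "c"]`; the default selector accepts everything, mirroring the "all utterances are included by default" behaviour.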
- 
get_longest_paths() → List[List[convokit.model.utterance.Utterance]]¶ Finds the Utterances that form the longest path (i.e. root to leaf) in the Conversation tree. If there are multiple paths with tied lengths, returns all of them as a list of lists. If only one such path exists, a list containing a single list of Utterances is returned.
- Returns
 a list of lists of Utterances
- 
get_root_to_leaf_paths() → List[List[convokit.model.utterance.Utterance]]¶ Get the paths (stored as a list of lists of utterances) from the root to each of the leaves in the conversational tree
- Returns
 List of lists of Utterances
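To make the return shape of the two path methods above concrete, here is a minimal standalone sketch over a toy reply tree. It does not use ConvoKit: the `children` mapping (utterance ID to the IDs that reply to it) and the helper names `root_to_leaf_paths` and `longest_paths` are assumptions for illustration, and paths hold plain ID strings rather than Utterance objects.

```python
from typing import Dict, List


def root_to_leaf_paths(children: Dict[str, List[str]], root: str) -> List[List[str]]:
    """Collect every path from `root` down to a leaf of the reply tree."""
    paths: List[List[str]] = []
    stack: List[List[str]] = [[root]]
    while stack:
        path = stack.pop()
        replies = children.get(path[-1], [])
        if not replies:
            paths.append(path)  # the last node has no replies: a leaf
        else:
            # Push in reverse so the first reply is expanded first.
            for reply in reversed(replies):
                stack.append(path + [reply])
    return paths


def longest_paths(paths: List[List[str]]) -> List[List[str]]:
    """Return all paths tied for the maximum length (a list of lists)."""
    if not paths:
        return []
    best = max(len(p) for p in paths)
    return [p for p in paths if len(p) == best]


# Toy tree: u2 and u3 reply to u1, and u4 replies to u2.
children = {"u1": ["u2", "u3"], "u2": ["u4"]}
all_paths = root_to_leaf_paths(children, "u1")
longest = longest_paths(all_paths)
```

For this tree, `all_paths` is `[["u1", "u2", "u4"], ["u1", "u3"]]` and `longest` is `[["u1", "u2", "u4"]]`, so a single longest path still comes back wrapped in an outer list.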
- 
get_speaker(speaker_id: str) → convokit.model.speaker.Speaker¶ Looks up the Speaker with the given ID. Raises a KeyError if no speaker with that ID exists.
- Returns
 the Speaker with the given speaker_id
- 
get_speaker_ids() → List[str]¶ Produces a list of ids of all speakers in the Conversation, which can be used in calls to get_speaker() to retrieve specific speakers. Provides no ordering guarantees for the list.
- Returns
 a list of speaker ids
- 
get_speakers_dataframe(selector: Optional[Callable[[convokit.model.speaker.Speaker], bool]] = <function Conversation.<lambda>>, exclude_meta: bool = False)¶ Get a DataFrame of the Speakers that have participated in the Conversation with fields and metadata attributes, with an optional selector that filters Speakers that should be included. Edits to the DataFrame do not change the corpus in any way.
- Parameters
 selector – a (lambda) function that takes a Speaker and returns True or False (i.e. include / exclude). By default, the selector includes all Speakers in the Conversation.
exclude_meta – whether to exclude metadata
- Returns
 a pandas DataFrame
- 
get_subtree(root_utt_id)¶ Get the utterance node for the utterance with the specified ID; this node is the root of the returned subtree.
- Parameters
 root_utt_id – id of the root node that the subtree starts from
- Returns
 UtteranceNode object
- 
get_utterance(ut_id: str) → convokit.model.utterance.Utterance¶ Looks up the Utterance associated with the given ID. Raises a KeyError if no utterance by that ID exists.
- Returns
 the Utterance with the given ID
- 
get_utterance_ids() → List[str]¶ Produces a list of the unique IDs of all utterances in the Conversation, which can be used in calls to get_utterance() to retrieve specific utterances. Provides no ordering guarantees for the list.
- Returns
 a list of IDs of Utterances in the Conversation
- 
get_utterances_dataframe(selector=<function Conversation.<lambda>>, exclude_meta: bool = False)¶ Get a DataFrame of the Utterances in the Conversation with fields and metadata attributes. Set an optional selector that filters Utterances that should be included. Edits to the DataFrame do not change the corpus in any way.
- Parameters
 exclude_meta – whether to exclude metadata
selector – a (lambda) function that takes an Utterance and returns True or False (i.e. include / exclude). By default, the selector includes all Utterances in the Conversation.
- Returns
 a pandas DataFrame
- 
get_vector(vector_name: str, as_dataframe: bool = False, columns: Optional[List[str]] = None)¶ Get the vector stored as vector_name for this object.
- Parameters
 vector_name – name of vector
as_dataframe – whether to return the vector as a dataframe (True) or in its raw array form (False). False by default.
columns – optional list of named columns of the vector to include. All columns returned otherwise. This parameter is only used if as_dataframe is set to True
- Returns
 a numpy / scipy array, or a dataframe if as_dataframe is set to True
- 
iter_speakers(selector: Callable[[convokit.model.speaker.Speaker], bool] = <function Conversation.<lambda>>) → Generator[convokit.model.speaker.Speaker, None, None]¶ Get Speakers that have participated in the Conversation, with an optional selector that filters for Speakers that should be included.
- Parameters
 selector – a (lambda) function that takes a Speaker and returns True or False (i.e. include / exclude). By default, the selector includes all Speakers in the Conversation.
- Returns
 a generator of Speakers
- 
iter_utterances(selector: Callable[[convokit.model.utterance.Utterance], bool] = <function Conversation.<lambda>>) → Generator[convokit.model.utterance.Utterance, None, None]¶ Get Utterances in the Conversation, with an optional selector that filters for Utterances that should be included.
- Parameters
 selector – a (lambda) function that takes an Utterance and returns True or False (i.e. include / exclude). By default, the selector includes all Utterances in the Conversation.
- Returns
 a generator of Utterances
- 
print_conversation_stats()¶ Helper function for printing the number of Utterances and Speakers in the Conversation.
- Returns
 None (prints output)
- 
print_conversation_structure(utt_info_func: Callable[[convokit.model.utterance.Utterance], str] = <function Conversation.<lambda>>, limit: int = None) → None¶ Prints an indented representation of utterances in the Conversation with conversation reply-to structure determining the indented level. The details of each utterance to be printed can be configured.
If limit is set to a value other than None, this will annotate utterances with an ‘order’ metadata indicating their temporal order in the conversation, where the first utterance in the conversation is annotated with 1.
- Parameters
 utt_info_func – callable function taking an utterance as input and returning a string of the desired utterance information. By default, this is a lambda function returning the utterance’s speaker’s id
limit – maximum number of utterances to print out. If set to k, the first k utterances are printed.
- Returns
 None. Prints to stdout.
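The idea of an indented view driven by reply-to structure can be sketched as follows. This is a standalone toy, not ConvoKit's code: the `children` mapping, the `speakers` lookup, the `structure_lines` helper, and the four-space indent are all assumptions for illustration, and the `label` argument only plays a role similar to utt_info_func.

```python
from typing import Callable, Dict, List


def structure_lines(
    children: Dict[str, List[str]],
    root: str,
    label: Callable[[str], str] = lambda utt_id: utt_id,
    depth: int = 0,
) -> List[str]:
    """Return one line per utterance, indented by its depth in the reply tree."""
    lines = [" " * (4 * depth) + label(root)]
    for reply in children.get(root, []):
        lines.extend(structure_lines(children, reply, label, depth + 1))
    return lines


# Toy tree: u2 and u3 reply to u1, and u4 replies to u2.
children = {"u1": ["u2", "u3"], "u2": ["u4"]}
speakers = {"u1": "alice", "u2": "bob", "u3": "carol", "u4": "alice"}

by_id = structure_lines(children, "u1")
by_speaker = structure_lines(children, "u1", label=lambda utt_id: speakers[utt_id])
print("\n".join(by_speaker))
```

Swapping the `label` callable changes what is shown for each utterance without touching the tree walk, which is the same separation the utt_info_func parameter provides.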
- 
retrieve_meta(key: str)¶ Retrieves the value stored under the given key in the metadata of the corpus object
- Parameters
 key – name of metadata attribute
- Returns
 value
- 
traverse(traversal_type: str, as_utterance: bool = True)¶ Traverse the Conversation tree structure in breadth-first (‘bfs’), depth-first (‘dfs’), pre-order (‘preorder’), or post-order (‘postorder’) order.
- Parameters
 traversal_type – dfs, bfs, preorder, or postorder
as_utterance – whether the iterator should yield the utterance (True) or the utterance node (False)
- Returns
 an iterator of the utterances or utterance nodes
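For readers unfamiliar with the traversal names, the sketch below shows the orders that breadth-first, pre-order, and post-order traversal produce on a small reply tree. It is a generic illustration and not ConvoKit's implementation: the `children` mapping and the three generator functions are hypothetical, and the sketch does not attempt to show how ‘dfs’ differs from ‘preorder’ inside the library.

```python
from collections import deque
from typing import Dict, Iterator, List


def bfs(children: Dict[str, List[str]], root: str) -> Iterator[str]:
    """Visit nodes level by level, starting from the root."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(children.get(node, []))


def preorder(children: Dict[str, List[str]], root: str) -> Iterator[str]:
    """Visit a node before any of its replies."""
    yield root
    for reply in children.get(root, []):
        yield from preorder(children, reply)


def postorder(children: Dict[str, List[str]], root: str) -> Iterator[str]:
    """Visit a node only after all of its replies."""
    for reply in children.get(root, []):
        yield from postorder(children, reply)
    yield root


# Toy tree: u2 and u3 reply to u1, and u4 replies to u2.
children = {"u1": ["u2", "u3"], "u2": ["u4"]}
bfs_order = list(bfs(children, "u1"))
pre_order = list(preorder(children, "u1"))
post_order = list(postorder(children, "u1"))
```

On this tree the orders are `["u1", "u2", "u3", "u4"]` for breadth-first, `["u1", "u2", "u4", "u3"]` for pre-order, and `["u4", "u2", "u3", "u1"]` for post-order.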