Corpus of Conversational Persian: Introduction¹

Ariana N. Mohammadi

Abstract

The Corpus of Conversational Persian (CCP) is a collection of transcribed spoken language data extracted from ~20 hours of naturally occurring informal conversations in Iranian Persian, Tehrani dialect. The data were collected by twenty-two research participants who recorded their daily phone calls and face-to-face interactions in a variety of informal settings. The corpus contains forty-three freestanding XML text files that are validated against an internal DTD declaration. The corpus is annotated and is tagged for gender. This paper provides a description of properties, composition, transcription, markup, and standardization of the corpus.

Keywords: Corpus linguistics; Corpora; Corpus of spoken Persian; Gender-tagged corpus

Please cite this paper as: Mohammadi, A. N. (2019). Corpus of Conversational Persian: Introduction. DOI: 10.13140/RG.2.2.20630.09286/1.

¹ Content notice: The Corpus of Conversational Persian may contain coarse or offensive language. User discretion is advised.

1. Description

The Corpus of Conversational Persian (CCP) documents spontaneous, authentic, and naturally occurring informal interactions in Iranian Persian, Tehrani dialect. The CCP is part of the General Corpus of Persian (Mohammadi, 2018) and contains two sub-corpora: face-to-face conversations and phone calls. The corpus includes only texts (not