Phylogenetic trees: Grammar versus vocabulary
- Authors: Polyakov V.N.1, Makarova E.A.2, Solovyev V.D.3
- Affiliations:
- Institute of Linguistics of Russian Academy of Sciences
- Kazan Federal University
- Issue: Vol 26, No 1 (2022)
- Pages: 31-50
- Section: Articles
- URL: https://journals.rudn.ru/linguistics/article/view/30639
- DOI: https://doi.org/10.22363/2687-0088-26460
Abstract
Traditionally, genealogical relationships between languages are established on the basis of phonetic and lexical data. The question whether genealogical relationships among languages can be defined based on grammatical data remains unanswered. The objective of this article is to compare two phylogenetic trees: one built using the Automated Similarity Judgment Program (ASJP) project, and one using the World Atlas of Language Structures (WALS). We include data from WALS representing 27 languages from 5 language families of all continents that are deemed to be sufficiently well described. A Hamming distance matrix was calculated for all languages under study, and, based on the matrix, a phylogenetic tree was built. The trees built according to WALS and ASJP data are compared with each other and with a tree built by the classical comparative historical method. Both the ASJP-based tree and the WALS-based tree have their advantages and disadvantages. The ASJP-based tree is a good reflection of the evolutionary divergence of languages. Similarities of languages as calculated based on the typological database of WALS can provide information on the history of languages both in terms of genealogical descent and contact with other languages. The ASJP-based tree reflects genealogical relationship well at a relatively small time depth, while the WALS-based tree reflects genealogical relationship well at large time intervals. We suggest a new variant of a phylogenetic tree that includes information on both the divergence (ASJP project) and the convergence (WALS project) of languages, combining the benefits of both of these trees, although the problem of borrowings remains. The present research reveals prospects for future studies of genealogical relations among languages based on large-scale descriptions of their grammatical structures.
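The distance-matrix step described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than the authors' procedure: the toy feature vectors in `FEATURES` are hypothetical, the Hamming distance is normalized by the number of features recorded for both languages (the abstract does not say how missing values are treated), and the tree is built with simple average-linkage clustering (UPGMA), since the abstract does not name the tree-building algorithm.

```python
from itertools import combinations

# Hypothetical toy data: each language maps to a vector of categorical
# feature values (WALS-style). None marks a feature with no recorded value.
FEATURES = {
    "A": [1, 2, 1, 3, None],
    "B": [1, 2, 2, 3, 1],
    "C": [2, 1, 2, 1, 1],
    "D": [2, 1, 2, 2, 1],
}


def hamming(u, v):
    """Share of mismatching features among those recorded for both languages.

    Normalizing by the number of shared features is an assumption made here
    so that languages with missing values stay comparable.
    """
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no shared features to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def distance_matrix(features):
    """Pairwise distances keyed by frozenset({language1, language2})."""
    return {
        frozenset((x, y)): hamming(features[x], features[y])
        for x, y in combinations(sorted(features), 2)
    }


def upgma(features):
    """Average-linkage clustering; returns the tree as nested tuples."""
    dist = distance_matrix(features)
    # Each cluster is (subtree, list of leaf names under that subtree).
    clusters = [(name, [name]) for name in sorted(features)]

    def average(c1, c2):
        # Mean distance over all leaf pairs drawn from the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1[1] for b in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        # Merge the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


if __name__ == "__main__":
    print(hamming(FEATURES["A"], FEATURES["B"]))
    print(upgma(FEATURES))
```

In this toy example the two pairs of languages that share most feature values are grouped first, and the two groups are joined last. A real analysis would use the full WALS feature set and a dedicated phylogenetics package.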
About the authors
Vladimir Polyakov
Author for correspondence.
Email: MakarovaEA@iling-ran.ru
Elena Makarova
Institute of Linguistics of Russian Academy of Sciences
Email: MakarovaEA@iling-ran.ru
Junior Researcher, Section of Applied Linguistics. Address: 1 bld. 1 Bolshoy Kislovsky Lane, Moscow, 125009, Russia
Valery Solovyev
Kazan Federal University
Email: maki.solovyev@mail.ru
Doctor Habil. of Physics and Mathematics, Professor, Chief Researcher of the “Text Analytics” Research Laboratory. Address: 18 Kremlevskaya St., 420008, Kazan, Russia