Phylogenetic trees: Grammar versus vocabulary

Abstract

Traditionally, genealogical relationships between languages are established on the basis of phonetic and lexical data. Whether such relationships can also be established from grammatical data remains an open question. The objective of this article is to compare two phylogenetic trees: one built from data of the Automated Similarity Judgment Program (ASJP) project, and one built from the World Atlas of Language Structures (WALS). From WALS we include data on 27 languages from 5 language families across all continents that are deemed to be sufficiently well described. A Hamming distance matrix was calculated for all languages under study, and a phylogenetic tree was built from that matrix. The trees built from WALS and ASJP data are compared with each other and with a tree built by the classical comparative historical method. Both the ASJP-based tree and the WALS-based tree have advantages and disadvantages. The ASJP-based tree is a good reflection of the evolutionary divergence of languages. Similarities between languages calculated from the WALS typological database can provide information on the history of languages in terms of both genealogical descent and contact with other languages. The ASJP-based tree reflects genealogical relationships well at a relatively small time depth, while the WALS-based tree reflects them well at large time depths. We suggest a new variant of a phylogenetic tree that includes information on both the divergence (ASJP project) and the convergence (WALS project) of languages and so combines the benefits of both trees, although the problem of borrowings remains. The present research reveals prospects for future studies of genealogical relations among languages based on large-scale descriptions of their grammatical structures.
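The distance step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the language names, feature IDs, and values below are hypothetical placeholders rather than WALS data, and both the normalization by the number of shared features and the skipping of missing values are assumptions made for the illustration.

```python
from itertools import combinations

# Hypothetical toy data: each language maps feature IDs to categorical values.
# None marks a feature with no recorded value. These are NOT real WALS values.
features = {
    "lang_a": {"f1": 1, "f2": 2, "f3": 1, "f4": None},
    "lang_b": {"f1": 1, "f2": 3, "f3": 1, "f4": 2},
    "lang_c": {"f1": 2, "f2": 3, "f3": 4, "f4": 2},
}


def hamming_distance(x, y):
    """Share of mismatching values among features coded in both languages.

    Assumption: features missing in either language are skipped, and the
    mismatch count is divided by the number of shared features.
    """
    shared = [f for f in x if f in y and x[f] is not None and y[f] is not None]
    if not shared:
        raise ValueError("the two languages share no coded features")
    mismatches = sum(1 for f in shared if x[f] != y[f])
    return mismatches / len(shared)


def distance_matrix(data):
    """Build a symmetric matrix of pairwise distances with a zero diagonal."""
    names = sorted(data)
    matrix = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = hamming_distance(data[a], data[b])
        matrix[a][b] = matrix[b][a] = d
    return matrix


matrix = distance_matrix(features)
for name, row in matrix.items():
    print(name, {other: round(d, 3) for other, d in row.items()})
```

A matrix of this kind could then be passed to any distance-based tree-building method; which method the article uses is not stated in the abstract.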

About the authors

Vladimir Polyakov

Author for correspondence.
Email: MakarovaEA@iling-ran.ru

Elena Makarova

Institute of Linguistics of Russian Academy of Sciences

Email: MakarovaEA@iling-ran.ru
Junior Researcher, Section of Applied Linguistics
1 bld. 1 Bolshoy Kislovsky Lane, Moscow, 125009, Russia

Valery Solovyev

Kazan Federal University

Email: maki.solovyev@mail.ru
Doctor Habil. of Physics and Mathematics, Professor, Chief Researcher of the “Text analytics” Research Laboratory
18 Kremlevskaya St., 420008, Kazan, Russia

Copyright © Polyakov V., Makarova E., Solovyev V., 2022

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.