Spelling performance on the web and in the lab
Authors:
Arnaud Rey aff001; Jean-Luc Manguin aff003; Chloé Olivier aff001; Sébastien Pacton aff004; Pierre Courrieu aff001
Author affiliations:
aff001: Laboratoire de Psychologie Cognitive, CNRS–Aix-Marseille Université, Marseille, France
aff002: Institute of Language, Communication and the Brain, Aix-Marseille Université, Marseille, France
aff003: GREYC, CNRS–Université de Caen Basse-Normandie–ENSICAEN, Caen, France
aff004: Laboratoire Mémoire, Cerveau et Cognition, Université Paris Descartes, Paris, France
Published in:
PLoS ONE 14(12)
Category:
Research Article
doi:
https://doi.org/10.1371/journal.pone.0226647
Abstract
Several dictionary websites give access to semantic, synonym, or spelling information about a given word. Over nine years, we systematically recorded every letter sequence entered into a French web dictionary. This yielded a total of 200 million orthographic forms, from which we built a large-scale database of spelling errors that could inform psychological theories of spelling processes. To check the reliability of this big data methodology, we selected a sample of 100 frequently misspelled words from the database, and a group of 100 French university students performed a spelling-to-dictation test on this list. The results showed a strong correlation between the two data sets in the frequencies of the produced spellings (r = 0.82). Although the distributions of spelling errors were relatively consistent across the two databases, the proportions of correct responses differed significantly. Regression analyses suggested possible explanations for these differences in terms of task-dependent factors. We argue that comparing the results of such large-scale databases with those of standard, controlled experimental paradigms is a good way to determine the conditions under which this big data methodology can adequately inform psychological theories.
Keywords:
Database and informatics methods – Experimental psychology – Information retrieval – Lexicons – Phonemes – Phonology – Regression analysis – Semantics
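The comparison summarized in the abstract, correlating how often each spelling is produced on the web and in the lab, can be illustrated with a minimal sketch. It is not the authors' actual analysis pipeline: the function names, the alignment over the union of produced spellings, and the counts below are all hypothetical and chosen only to show the idea of a Pearson correlation between two spelling-frequency distributions.

```python
import math


def spelling_proportions(counts):
    """Convert raw counts per produced spelling into proportions that sum to 1."""
    total = sum(counts.values())
    return {form: n / total for form, n in counts.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def aligned_proportions(web_counts, lab_counts):
    """Pair web and lab proportions over the union of produced spellings.

    A spelling that never occurs in one source gets proportion 0.0 there.
    """
    web = spelling_proportions(web_counts)
    lab = spelling_proportions(lab_counts)
    forms = sorted(set(web) | set(lab))
    web_p = [web.get(form, 0.0) for form in forms]
    lab_p = [lab.get(form, 0.0) for form in forms]
    return forms, web_p, lab_p


# Hypothetical counts for a single target word (illustrative only,
# not taken from the study's database).
web_counts = {"accueil": 700, "acceuil": 250, "aceuil": 50}
lab_counts = {"accueil": 60, "acceuil": 35, "accueuil": 5}

forms, web_p, lab_p = aligned_proportions(web_counts, lab_counts)
r = pearson_r(web_p, lab_p)
print(forms)
print(round(r, 3))
```

In a full analysis the paired proportions would presumably be pooled across all 100 selected words before computing a single coefficient; the sketch handles one word to keep the alignment step visible.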
This article was published in the journal
PLOS One
2019, Issue 12