Telfor Journal
2014, vol. 6, no. 1, pp. 64-68
Evaluation and classification of syntax usage in determining short-text semantic similarity
University of Belgrade, School of Electrical Engineering

e-mail: bv115045p@student.etf.rs, bojic@etf.rs
Project:
Development of hardware, software and telecommunication infrastructure of e-systems for the control of turnover and taxes (MPNTR - 32047)

Keywords: natural language processing; MSRPC; parsing; part-of-speech tagging; semantic role labeling; short-text semantic similarity; syntax; word order
Abstract
This paper outlines and categorizes ways of using syntactic information in a number of algorithms for determining the semantic similarity of short texts. We consider the use of word order information, part-of-speech tagging, parsing and semantic role labeling. We analyze and evaluate the effects of syntax usage on algorithm performance by utilizing the results of a paraphrase detection test on the Microsoft Research Paraphrase Corpus. We also propose a new classification of algorithms based on their applicability to languages with scarce natural language processing tools.
About the article

article language: English
article type: unclassified
DOI: 10.5937/telfor1401064B
published in SCIndeks: 15 May 2015