Data Mining Scientific Research Paper

  • The 3rd International Workshop on Mining Scientific Publications will take place from the 8th to the 12th September in London. It is a cross-disciplinary workshop for researchers, industry practitioners, digital library developers, and open access enthusiasts. Kris Jack, Chief Data Scientist here at Mendeley, is co-organizing the event along with CORE, the Open University, Athena Research and Innovation Center, and the European Library/Europeana.

    The aim is to bring together people from different backgrounds to explore the possibilities around data mining tools, and how they can be used to save researchers’ time by finding and processing huge amounts of information quickly and easily.

    We’re asking for submissions before the 13th July 2014 from those interested in analysing and mining databases of scientific publications, developing systems to enable such analysis, or designing new technologies to improve research and the free availability of research data. Researchers should submit their papers online for inclusion in the programme. Both long papers (up to eight pages in the ACM style) and short papers (not exceeding four pages) are welcome, as are practical demonstrations and presentations of systems and methods (demonstration submissions should consist of a two-page description of the system, method or tool).

    “We’re looking to attract researchers from across academia and industry to work through the amazing possibilities and challenges around mining scientific content. The collaborations that come from these initiatives always yield really interesting results, so I’m looking forward to seeing what submissions we get through this year,” says Kris.

    The workshop will be structured around three main themes:

    1. The whole ecosystem of infrastructures, including repositories, aggregators, text- and data-mining facilities, impact monitoring tools, datasets, services and APIs that enable analysis of large volumes of scientific publications.
    2. Semantic enrichment of scientific publications by means of text-mining, crowdsourcing or other methods.
    3. Analysis of large databases of scientific publications to identify research trends, high impact, cross-fertilisation between disciplines, research excellence, etc.

    This year, we have also put together a CORE publications dataset containing a large array of publications from various research areas. It includes full text as well as enriched versions of metadata, with the aim of providing workshop participants with a framework for developing and testing methods and tools around the workshop topics. You can access this data through the CORE portal.

    If you have any questions or comments, leave them below or tweet @WOSP2014.
