CC resources

ICCC corpus from years 2010 to 2015
Proceedings - one folder per year, one document per article; references are delimited by the marker </references_biblio>.
Proceedings in named line-document format: .txt file - one line per document, starting with the document ID, followed by the proceedings year; articles are given without their reference sections.
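
A minimal sketch of how the one-line-per-document file could be read. The listing above gives only the order of the fields (ID, then year, then the article), so the whitespace separator and the names Article, parse_line and read_corpus are assumptions made for illustration, not part of the resource itself.

```python
from typing import Iterable, Iterator, NamedTuple


class Article(NamedTuple):
    doc_id: str   # document ID, first field on the line
    year: int     # proceedings year, second field on the line
    text: str     # article body without its reference section


def parse_line(line: str) -> Article:
    # Assumed layout: "<ID> <year> <article text ...>", separated by whitespace.
    parts = line.strip().split(maxsplit=2)
    if len(parts) < 2:
        raise ValueError(f"expected at least an ID and a year, got: {line!r}")
    text = parts[2] if len(parts) > 2 else ""
    return Article(doc_id=parts[0], year=int(parts[1]), text=text)


def read_corpus(lines: Iterable[str]) -> Iterator[Article]:
    # Skip blank lines; every other line is treated as one document.
    for line in lines:
        if line.strip():
            yield parse_line(line)
```

Passing an open file handle to read_corpus (opened with encoding="utf-8") yields one Article per line, which can then be grouped by year. If the actual file uses a tab or another separator, only the split call in parse_line needs to change.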

CC terminology - top 1500 single-word and multiword terms
CC terminology - top 1500 multiword terms

Ontologies
Ontologies were constructed using the OntoGen tool by Fortuna et al.
Initial ontology (.rdf, .png): completely automated clustering of ICCC documents
Named clusters (.rdf, .png): the initial ontology with manually named clusters
Improved (.rdf, .png): semi-automated ontology (automated construction combined with manual improvement)
Years (.rdf, .png): characteristic and distinctive keywords by year (category names: the first three distinctive (SVM) keywords)