Data Mining in the SAND
Date: Tuesday 20 June @ 15:38:31 GMT+1
Topic: Research


My third research question is: "What are relevant dependencies between syntactic variables?" To help answer this question, my program varc, the VARiable Correlation program, generates all relations between all variables in an Excel-compatible file that can be browsed interactively.

The 13-column table consists of:

  1. #From variable
  2. #To variable
  3. #Variable distance: The Hamming distance between two atomic variables based on their respective geographic distributions using the union set of locations where either one of the variables occurs.
  4. #Confidence: The ratio of locations included in the distance measurement. For example, a confidence value of 0.333 indicates that 89 of the 267 dialect locations were used to measure the distance between the two variables.
  5. #From example
  6. #To example
  7. #From context
  8. #To context
  9. #From Gloss
  10. #To Gloss
  11. #From Translation
  12. #To Translation
  13. #Reversed: Whether the row describes the distance from FROM to TO or from TO to FROM. This is useful when sorting the data, and it is the only reason the data is presented bidirectionally.
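
The distance, confidence and reversed columns can be illustrated with a minimal sketch. It is not the varc source: the function names, the set-based representation of a variable's locations, and the choice to report the Hamming distance as a raw, unnormalized count are all assumptions made for illustration.

```python
from itertools import combinations


def variable_distance(locs_a, locs_b, total_locations):
    """Return (distance, confidence) for two variables.

    locs_a and locs_b are sets of location IDs where each variable occurs
    (a hypothetical representation). Only the union of both sets is compared.
    """
    union = locs_a | locs_b
    # Inside the union, the two presence/absence vectors differ exactly at
    # the locations where only one of the variables occurs.
    distance = len(locs_a ^ locs_b)  # assumption: raw count, not normalized
    confidence = len(union) / total_locations
    return distance, confidence


def bidirectional_rows(occurrences, total_locations):
    """Emit every variable pair twice, the second time flagged as reversed."""
    rows = []
    for a, b in combinations(sorted(occurrences), 2):
        dist, conf = variable_distance(occurrences[a], occurrences[b], total_locations)
        rows.append((a, b, dist, conf, False))  # FROM -> TO
        rows.append((b, a, dist, conf, True))   # TO -> FROM (reversed)
    return rows


# Hypothetical data: two variables that together cover 89 of 267 locations.
example = {
    "var_a": set(range(0, 60)),
    "var_b": set(range(30, 89)),
}
rows = bidirectional_rows(example, total_locations=267)
```

In this made-up example the union holds 89 locations, so the confidence rounds to 0.333, matching the example given for the confidence column. Emitting each pair in both directions lets a spreadsheet sorted on the first column show every variable together with all of its partners.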






This article comes from marco@work
http://marco.info/pro

The URL for this story is:
http://marco.info/pro/modules.php?name=News&file=article&sid=175