Doing empirical research with the help of Netlogo and R

Gerard Vreeswijk

Introduction

Empirical research is research by doing experiments. It turns out that it is extremely easy to organise AI experiments with the help of Netlogo and R.

Netlogo and R

BehaviorSpace

Although Netlogo and R come from different cultures, they can be made to work together in a very productive way. To begin with, Netlogo facilitates the organisation of experiments by means of a tool called BehaviorSpace. Netlogo's BehaviorSpace starts out as an empty suite that can be filled with experiments. Each experiment is a collection of runs, obtained by asking Netlogo to run your program with different parameter settings. In doing so, the different program instances run in parallel on all CPU cores, and no output is visible. A fuller explanation of BehaviorSpace can best be taken from Netlogo's manual itself.
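
To make the notion of "different parameter settings" concrete, the following R sketch enumerates the runs of a hypothetical experiment in which every combination of input values is run a fixed number of times. The variable names, the value ranges, and the number of repetitions are assumptions chosen for illustration; they do not come from any particular Netlogo model.

```r
# Hypothetical parameter ranges (assumed for illustration only).
settings <- expand.grid(
  independent_1 = c(89, 93, 94, 97),
  independent_2 = c(21, 40, 42, 43),
  repetition    = 1:2   # each combination is repeated twice
)

# One row per run: 4 * 4 * 2 = 32 runs in total.
settings$run_number <- seq_len(nrow(settings))
n_runs <- nrow(settings)

print(head(settings))
print(n_runs)
```

The number of runs grows multiplicatively with every extra input variable, which is worth keeping in mind when you design an experiment.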

Save as table

When you ask BehaviorSpace to execute an experiment, it asks you how to save its output. It offers two options, viz. “spreadsheet output” and “table output”. By default, “spreadsheet output” is ticked and “table output” is not. Since I always use table output for further statistical processing, I un-tick “spreadsheet output” (for efficiency) and tick “table output” (for results). Netlogo then asks you for a file name ending in .csv (comma-separated values) to store the table in. Say you choose my_results.csv. After the experiment has ended, my_results.csv might look like this:

"BehaviorSpace results (NetLogo <version>)"
"<name of your netlogo program>.nlogo"
"<name of your experiment>"
"<date and time experiment was executed>"
"min-pxcor","max-pxcor","min-pycor","max-pycor"
"0","11","0","11"
"[run number]","[step]","independent-1","independent-2","dependent"
"1","1000","94","43","3.4413"
"3","1000","93","42","2.5244"
"4","1000","97","40","1.4333"
"2","1000","94","43","3.3132"
"6","1000","89","21","1.1313"
"5","1000","97","40","1.7511"
....
and so forth

Let me explain the structure of this table. The first 6 lines (line 6 is "0","11","0","11") are not interesting for statistical processing. Lines 7 and further represent the actual table of experiment results. Line 7 contains the header of this table: the run number, surrounded by brackets because it is system-provided information; the step number (also system-provided); two input variables, “independent-1” and “independent-2”; and one output variable, “dependent”. So, for example, run nr. 2 stopped at step 1000, and the values of its input and output variables are 94, 43, and 3.3132, respectively. The run numbers do not increase monotonically because runs execute in parallel (at least when you ask for it). In our case, run nr. 1 finished first, then run nr. 3, then run nr. 4, and so on.

The present table may then be further processed by R.
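
As a minimal sketch of what this looks like, the snippet below reads an inline copy of the example table and puts the runs back in run-number order. The inline string stands in for my_results.csv, and the names runs and runs_sorted are chosen here for illustration only.

```r
# Inline copy of the example table; the "<...>" entries are placeholders.
raw <- '"BehaviorSpace results (NetLogo <version>)"
"<name of your netlogo program>.nlogo"
"<name of your experiment>"
"<date and time experiment was executed>"
"min-pxcor","max-pxcor","min-pycor","max-pycor"
"0","11","0","11"
"[run number]","[step]","independent-1","independent-2","dependent"
"1","1000","94","43","3.4413"
"3","1000","93","42","2.5244"
"4","1000","97","40","1.4333"
"2","1000","94","43","3.3132"
"6","1000","89","21","1.1313"
"5","1000","97","40","1.7511"'

# Skip the 6 header lines; line 7 supplies the column names.
runs <- read.csv(text = raw, skip = 6)

# read.csv turns the headers into syntactically valid R names,
# so "[run number]" becomes X.run.number. and "independent-1"
# becomes independent.1.
print(names(runs))

# Runs finish out of order; sort them by run number.
runs_sorted <- runs[order(runs$X.run.number.), ]
print(runs_sorted)
```

Note the renamed columns: any later code has to refer to independent.1 rather than independent-1, unless you pass check.names = FALSE to read.csv.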

R

A typical R script to process the table above looks like the following:

#!/usr/bin/Rscript -w
 
my_data = read.csv("my_results.csv", skip=6) # Read comma separated Netlogo output
                                             # and ignore the first 6 lines.
                                             # Always ignore the first 6 lines of Netlogo table output.
 
pdf("my_results.pdf") # Write results in PDF format.  Many other formats are possible.
 
# Do interesting things with 'my_data'.  Then produce nice plots of these results, which will
# then automatically be written to 'my_results.pdf'.
 
dev.off() # Finalise PDF

I assume that you have installed R and that Rscript is in your path. Of course the “do interesting things with 'my_data'” part needs further clarification, so here is an elaborated example.

A worked example

In this example, we will add an experiment to an existing Netlogo program, run this experiment, process the results in R, and make a nice plot.
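
The R half of these steps can be sketched as follows. This is only a minimal sketch under stated assumptions: it assumes the table layout shown earlier, uses the six example rows as stand-in data, and writes to temporary files instead of my_results.csv and my_results.pdf. The choice of per-setting means and a simple scatter plot is likewise an assumption about what the "interesting things" might be.

```r
# Write a small stand-in for my_results.csv to a temporary file.
csv_file <- tempfile(fileext = ".csv")
writeLines(c(
  '"BehaviorSpace results (NetLogo <version>)"',
  '"<name of your netlogo program>.nlogo"',
  '"<name of your experiment>"',
  '"<date and time experiment was executed>"',
  '"min-pxcor","max-pxcor","min-pycor","max-pycor"',
  '"0","11","0","11"',
  '"[run number]","[step]","independent-1","independent-2","dependent"',
  '"1","1000","94","43","3.4413"',
  '"3","1000","93","42","2.5244"',
  '"4","1000","97","40","1.4333"',
  '"2","1000","94","43","3.3132"',
  '"6","1000","89","21","1.1313"',
  '"5","1000","97","40","1.7511"'
), csv_file)

# Read the table, ignoring the first 6 lines.
my_data <- read.csv(csv_file, skip = 6)

# Average the output variable over runs that share the same inputs.
means <- aggregate(dependent ~ independent.1 + independent.2,
                   data = my_data, FUN = mean)
print(means)

# Plot every run: output variable against the first input variable.
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file)
plot(my_data$independent.1, my_data$dependent,
     xlab = "independent-1", ylab = "dependent",
     main = "Output per run")
invisible(dev.off())  # Finalise PDF
```

With real data you would replace the temporary files by my_results.csv and my_results.pdf, and the plot by whatever best answers your research question.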

So there you have it. Now go and write your own R scripts. Sure, this will not be easy. R has a steep learning curve, and many times you will be searching the net for even the simplest R tasks. But in the end R can do almost anything you want, so it is not so much a matter of possibilities as a matter of persistence and perseverance.

On this page I explained how Netlogo's output can be processed by R scripts. But of course there are other ways to couple R and Netlogo. To begin with, just as there are Netlogo extensions for sound, associative arrays, GIS, and video, there is a Netlogo extension for R, called R-Extension. This extension is further described in Thiele et al.'s article Agent-Based Modelling: Tools for Linking NetLogo and R. Also worth looking at is the RNetLogo package for running NetLogo inside R. However, before adopting such tools I suggest studying the approach explained here first. After that you are in a better position to assess which tools might help you to address your research problem.

For a more general treatment of empirical methods in AI, see Cohen's Empirical Methods for Artificial Intelligence, Walsh's How not to do it, and Beck et al.'s Five pitfalls of empirical scheduling research. Good luck!


This page last modified at Fri, 04 Mar 16 16:12:36 +0100.