Python Data Analysis in Cognitive Science: May 2016

Sunday, 8 May 2016

An introduction to Data Analysis with Python

In these great 10.25 minutes you will be introduced to doing data analysis with Python. Very nice. A must-see!

Friday, 6 May 2016

Great summary of Python and R - a short note

I found this post, R Vs Python For Datascience, and it is very interesting. The post compares R and Python, and you get to learn the pros and cons of the two languages. I personally like Python because it is a general-purpose language. However, there are many more (really, many more) packages for statistics in R.

You should really read the post, though. It is really helpful. The image above links to the site. The conclusion is that there is no winner: you have to choose a language yourself. I chose both. What will you choose?

Wednesday, 4 May 2016

Three great Python books to get you going (they are free)

In this short post I am going to share three great, and free, Python books. I think you could easily go from the first one I list (Think Python) to the last (Think Bayes), and you will have started your journey into Python and data science really nicely. You will have a lot of knowledge to build on! All three books are written by Allen Downey.

Think Python

Think Python: How to Think Like a Computer Scientist is a great starter for someone who wants to learn Python. The first edition uses Python 2 and the second edition uses Python 3. If you are completely new to Python, I suggest that you go for the second edition. This book introduces beginners to the Python language.

Think Stats

You are of course interested in statistics and Python, right? Then you should go on to Think Stats. As with Think Python, there is a first edition and a second edition, for Python 2 and 3, respectively. Think Stats introduces you to exploratory statistics using Python. These are really handy books and you will learn a lot, both about computation and about statistical programming (i.e., in Python).

Think Bayes

Think Bayes introduces you to Bayesian statistics. Bayesian statistics is really up and coming in the cognitive sciences. It offers a very intuitive interpretation (p-values are not intuitive!). For now you have to read a book written for Python 2. However, you will find updated code on Allen Downey's GitHub page: updated code. I would suggest that you read the book but look at the new code and learn how to do it in Python 3.

There are probably a bunch more free books out there for learning Python and statistics. I stick with these three in this post because they are short and you will learn so much from them.

That is it for now, take care.

Tuesday, 3 May 2016

More on learning how to code Python - a cognitive scientist's journey to coding

In this post I will continue the discussion on programming in Python for cognitive scientists. I will go from data collection, to analysis, and finally to writing up your results (yes, you can basically use Python for all these tasks!).

Collecting data

There are several ways you may collect data as a cognitive scientist. It all depends on your research question(s). In this post I will only discuss two: collecting online data using social media and/or questionnaires, and collecting data using laboratory experiments. In fact, I will barely mention the first one, but you can scrape a lot of behavior-like data off, for instance, Twitter, and maybe throw in questionnaires as well.

Creating experiments to use in data collection

Programming, or building, experiments has long been done with crappy and expensive tools (e.g., E-Prime). I do understand the attraction of simpler experiment-building tools where you drag and drop objects: when you are finished building your experiment, you generate a script by pressing a button. All fine. However, at times you may need to do more advanced stuff, and then you will need to add things like inline code (e.g., write some scripts and add them to the "timeline" in the builder). Recently, a couple of free and open-source Python tools for creating experiments have appeared. Two of them, PsychoPy and OpenSesame, offer builders and inline scripting (much like E-Prime).

OpenSesame builder GUI

Some of the others just give you an API to ease some of the coding of your experiment (PsychoPy can also be used as a library). That is, you import it as if it were any other Python library (after you have installed it, of course). For instance, if you use the Python library Expyriment, you import what you need from the library:

from expyriment import design, control, stimuli, io, misc

On Expyriment's website you can find some beginner's tutorials.

If you are interested in using PsychoPy's builder mode you can watch the following YouTube tutorial:

In this tutorial you will learn how to create a classical psychology experiment: the Stroop task (of course, in its original form pen and paper were used...). For a psycholinguistic researcher the following tutorial may be more suitable:

More resources on PsychoPy can be found on the software's resources page. You will find that coding with the PsychoPy library (e.g., importing the stuff you need for your experiment) is much like the short Expyriment example above.
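Whichever library ends up presenting the stimuli, a good part of coding an experiment is plain Python, for example building a randomized trial list. Here is a minimal sketch for a Stroop-like task using only the standard library. The color set, the function name make_stroop_trials, and the trial format are my own assumptions for illustration; they do not come from PsychoPy or Expyriment.

```python
import itertools
import random

# Hypothetical color set; a real experiment may use other colors.
COLORS = ["red", "green", "blue"]


def make_stroop_trials(repetitions=2, seed=None):
    """Return a shuffled list of Stroop trials.

    Every color word is crossed with every ink color, so with three
    colors one third of the trials are congruent (word == ink).
    """
    rng = random.Random(seed)  # a seed makes the order reproducible
    trials = [
        {"word": word, "ink": ink, "congruent": word == ink}
        for _ in range(repetitions)
        for word, ink in itertools.product(COLORS, repeat=2)
    ]
    rng.shuffle(trials)
    return trials


trials = make_stroop_trials(repetitions=2, seed=42)
for trial in trials[:3]:
    print(trial)
```

Each dictionary in the list would then be handed to whatever code draws the word on screen and records the response. Using a seed per participant is one simple way to be able to recreate the exact trial order later.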
When you have learned how to create and code your own experiment in Python, you will be able to collect a lot of data, and you will probably want to analyze it. Although MATLAB and, more recently, R have had the majority of the cognitive science crowd when it comes to analysis (you can also create experiments in MATLAB using Psychtoolbox and such), you can OF COURSE do your analysis in Python.

Data analysis

Common statistical methods in psychology and related fields are linear regression, the t-test, and analysis of variance (ANOVA), especially when it comes to experiments. In more subjective survey studies, other techniques such as factor analysis (FA) and structural equation modelling (SEM) are used. Of course, an experimental design may also need such multivariate analyses. If you are interested in FA and SEM in Python, however, I must disappoint you. As far as I know, you can only carry out principal component analysis (which is not 'real' factor analysis according to my old stats teacher!).

Enough of my rambling, you say? What CAN I do in Python?! Well, you CAN do t-tests, linear regression (non-linear also), ANOVA, etc. For instance, using the package Statsmodels we can carry out all of these methods (except for repeated measures ANOVA). scikit-learn, a machine learning library, can also do a lot of statistics. Of course, SciPy can do some basic parametric tests, and Pandas (and SciPy and NumPy) can compute most descriptive statistics you'd want. Repeated measures ANOVA can be carried out using the package Pyvttbl, which, sadly, seems to be unmaintained. No more updates of that...
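To make this a bit more concrete, here is a minimal sketch of descriptive statistics and an independent-samples t statistic using only Python's standard library. The reaction times are made-up numbers and the helper names describe and independent_t are my own. In practice you would more likely call scipy.stats.ttest_ind, which also gives you the p-value; this sketch only shows what goes into the t value.

```python
import math
import statistics

# Hypothetical reaction times (ms) for two independent groups; made-up data.
congruent = [512, 498, 530, 475, 505, 490]
incongruent = [580, 610, 565, 598, 620, 575]


def describe(values):
    """Return sample size, mean, and sample standard deviation."""
    return len(values), statistics.mean(values), statistics.stdev(values)


def independent_t(a, b):
    """Student's t statistic and degrees of freedom for two independent
    samples, assuming equal variances in the two groups."""
    n_a, mean_a, sd_a = describe(a)
    n_b, mean_b, sd_b = describe(b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their df.
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


t_value, df = independent_t(congruent, incongruent)
print(f"t({df}) = {t_value:.2f}")
```

A negative t value here simply means that the first group (congruent) has the lower mean reaction time.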

That is it. Most of the stuff I list here I found via this excellent site: Python and R as tools for data analysis and creating Psychology experiments. If you follow the link you will find discussions on Python IDEs for psychology researchers (or any other scientists), how to do ANOVA for repeated and dependent measures, and some descriptive statistics. All in Python.

That is it for me now.

Please leave a comment if you have any suggestions!

Monday, 2 May 2016

Bokeh Tutorial

In this great tutorial video you get to learn how to use Bokeh to create interactive visualisations.
