Posts


Feb. 28, 2021

Configuring Python to work like ESS mode in Spacemacs

I have found one of the best packages in Emacs to be ESS mode, which allows me to work seamlessly with R. I much prefer the setup and workflow of ESS to those of RStudio, in which I had previously done most of my R coding. However, I spend most of my time coding in Python in Jupyter notebooks. I have tried on several occasions to switch to using Jupyter notebooks inside Emacs, but it has never quite clicked for me and I’ve always gone back to just working in a browser.

Aug. 25, 2019

Optimal Data Decompression in R

I’ve recently encountered a situation where I am working with very large datasets in a constrained environment. As a result, the only practical option has been to store the datasets in a compressed format and then load them into R to start on the data analysis. The problem is that when you’re working with datasets in the 10 GB+ range on a normal desktop, loading the data into memory (thankfully there is still enough of that available!

May 23, 2017

Monitoring memory usage in jupyter notebooks

I recently encountered a series of memory errors when using pandas in the LSE high-powered computing environment. While detailing the problem isn’t going to be of particular interest to anyone, I thought that a quick rundown of how to monitor memory usage in a Jupyter notebook may be of use to a few people out there. While there are a few different ways of doing this, I found that using the memory_profiler package was by far the easiest option.

Mar. 23, 2017

Working with deciles and timeseries in python and pandas

Recently I’ve been working with the Land Registry’s price paid data set, looking at shifts in prices in different areas of the market. One of the ways I’ve been segmenting the massive amounts of data into something more manageable has been to look at specific deciles, say the top 10% of the market. Deciles and the like have been put to great use recently in the literature on income and wealth, with ‘the 1%’, a phrase we all now instantly ‘get’, being the perfect example.

Nov. 21, 2015

Academic PDF management with Zotero

Nowadays there is a plethora of reference managers competing for the attention of academics. While each has its own pros and cons, Zotero stands out as the best tool to use for managing a digital library of PDFs. Zotero is free, open source and maintained by academics, a real advantage at a time when most other reference managers are now run for profit or owned by large journal publishers. Moreover, Zotero is perfectly suited to helping you seamlessly manage your PDFs in a manner which suits you, without the risk of costly upgrades, of being locked out when you change institution, or of the company behind the product going bust.