Web scraping with R & novel classification algorithms on unbalanced data

Tomorrow, the next RBelgium meeting will be held at the bnosac offices. This is the schedule.

Interested? Feel free to join the event. More info: http://www.meetup.com/RBelgium/events/228427510/

• 18h00-18h30: enter & meet other R users

• 18h30-19h00: Web scraping with R: live scraping products & prices of www.delhaize.be

600 446999774.jpeg

• 19h15-20h00: State-of-the-art classification algorithms with unbalanced data. Package unbalanced: Racing for Unbalanced Methods Selection.

600 446999829.jpeg

Advanced R programming topics course in Leuven

wawitsr Last call for registration of the course on Advanced R programming topics.

Next week on February 17/18, the yearly R course on Advanced R programming topics in Leuven (Belgium) is scheduled.
Registration can be done at https://lstat.kuleuven.be/training/coursedescriptions/AdvancedprogramminginR.html.

The course is rewritten based on 3 years of extensive customer feedback and because of the tremendous evolution R has encountered in the last years.
You'll learn the following in this 2-day course:

functions and vectorisation
control flow
data handling using data.table (aggregation, rbinding, reshaping)
apply family of functions, split/apply/combine
parallelisation
error handling & debugging
building reports with latex & Sweave
building reports with latex/markdown and knitr
S3 & S4 classes, methods & generics
environments, search path, namespaces
creating your own R package
documenting your R package with Roxygen
building a vignette
R CMD check/build/install
unit testing of your functions
building your own corporate R package repository

Open Data in Belgium - release of BelgiumStatistics R package

On 22/10/2015, the Belgium government launched its Open Data initiative by releasing a number of datasets related to population statistics, fiscal information, 'kadaster', the 2011 census and some tools. Because BNOSAC works a lot with these kind of data and because we like to promote open data, an R package called BelgiumStatistics was made available for R users at https://github.com/jwijffels/BelgiumStatistics

opendatadataset

The package contains all the datasets released by Statistics Belgium (Bevolking, Werk, Leefmilieu, Census 2011) under the 'Licentie open data'. Readily available to R users. Thanks to the open data, analysing and visualising Belgium data has now become a lot smoother as the example below shows.

require(BelgiumStatistics)
require(data.table)
require(BelgiumMaps)
require(leaflet)
data(TF_SOC_POP_STRUCT_2015) ## Part of BelgiumStatistics
data(mapbelgium.fusiegemeenten.wgs) ## Part of BelgiumMaps (not released yet)

x <- as.data.table(TF_SOC_POP_STRUCT_2015)
x <- x[, list(MS_POPULATION = sum(MS_POPULATION),
              Foreigners = sum(MS_POPULATION[TX_NATLTY_NL == "Vreemdelingen"]) / sum(MS_POPULATION),
              Age = 100 * sum(MS_POPULATION * CD_AGE) / sum(MS_POPULATION),
              Females = 100 * sum(MS_POPULATION[CD_SEX == "F"]) / sum(MS_POPULATION)),
       by = list(CD_MUNTY_REFNIS, TX_MUNTY_DESCR_NL)]
x <- setDF(x)

mymap <- merge(mapbelgium.fusiegemeenten.wgs,
               x, by.x = "ORDER08", by.y = "CD_MUNTY_REFNIS", all.x=TRUE, all.y=FALSE)
mymap <- subset(mymap, !is.na(Foreigners))
pal <- colorNumeric(palette = "Blues", domain = mymap$Foreigners)
leaflet(mymap) %>%
  addTiles() %>%
  addPolygons(stroke = FALSE, smoothFactor = 0.2, fillOpacity = 0.85, color = ~pal(Foreigners))

beplot

If you are interested in geographical analysis or visualisations, Get in touch.

Text Mining with R

Last week, we had a great course on Text Mining with R at the European Data Innovation Hub. For persons interested in text mining with R, another 1-day crash course is scheduled at the Leuven Statistics Research Center (Belgium) on November 17 (http://lstat.kuleuven.be/training/coursedescriptions/text-mining-with-r). The following elements are covered in the course.

Import of (structured) text data with focus on text encodings. Detection of language
Cleaning of text data, regular expressions
String distances
Graphical displays of text data
Natural language processing: stemming, parts-of-speech (POS) tagging, tokenization, lemmatisation, entity recognition
Sentiment analysis
Statistical topic detection modelling and visualisation (latent dirichlet allocation)
Automatic classification using predictive modelling based on text data

topicplot

More information on the course & the registration: http://lstat.kuleuven.be/training/coursedescriptions/text-mining-with-r

If you are interested in applying Text Mining techniques on your data, get in touch: index.php/contact/get-in-touch

R courses on basic R, advanced R, statistical machine learning with R, text mining with R, spatial modelling with R and R package building

Waw, our course list for teaching R is getting bigger and bigger. We have now courses on basic, R, advanced R, R package building, statistical machine learning with R, text mining with R and spatial analysis with R. All face-to-face courses given in Belgium and scheduled in the coming months.

EducateR brandon

Some courses are given at the European Data Innovation Hub (Brussels, Belgium), other courses are given through the Leuven Statistics Research Center (Leuven, Belgium). From today on, you can register for the following courses regarding the use of R.
Prices are set to 300€ per course day + taxes.
For detailed information on the course content, have a look at the pdf which can be found here.

Courses given at the European Data Innovation Hub (Brussels, Belgium) - http://www.datainnovationhub.eu

Introduction to R programming: 07-08/09/2015 (2 days) (register at www.eventbrite.com/e/common-data-manipulation-for-r-programmers-tickets-17938118395)
Common data manipulation for R programmers: 14/09/2015 (1 day) (register at www.eventbrite.com/e/common-data-manipulation-for-r-programmers-tickets-17938118395)
Statistical Machine Learning with R: 28-29/09/2016 (2 days) (register at www.eventbrite.com/e/statistical-machine-learning-with-r-tickets-17938785390)
Text mining with R: 05/10/2015 (1 day) (register at www.eventbrite.com/e/text-mining-with-r-tickets-17939078266)
Reporting with R: 02/11/2015 (1 day) (register at www.eventbrite.com/e/reporting-with-r-tickets-17938500538)
Creating R packages and R repositories: 03/11/2015 (1 day) (register at www.eventbrite.com/e/training-creating-r-packages-and-r-repositories-tickets-18168431267)

Courses given at LStat (Leuven, Belgium) - http://lstat.kuleuven.be/training/index.htm

Statistical Machine Learning with R: 28-29/10/2015 (2 days) (register at lstat.kuleuven.be/training/coursedescriptions/statistical-machine-learning-with-r)
Text mining with R: 17/11/2015 (1 day) (register at lstat.kuleuven.be/training/coursedescriptions/text-mining-with-r)
Advanced R programming topics: 17-18/02/2016 (2 days) (register at lstat.kuleuven.be/training/coursedescriptions/AdvancedprogramminginR.html)
Applied Spatial Modelling with R: 13-14/04/2016 (1.5 days) (register at lstat.kuleuven.be/training/applied-spatial-modelling-with-r)

Hope to see you soon.

PS. Thanks Brandon for allowing us to use your wonderfull logo

Web scraping with R & novel classification algorithms on unbalanced data

Advanced R programming topics course in Leuven

Open Data in Belgium - release of BelgiumStatistics R package

Text Mining with R

R courses on basic R, advanced R, statistical machine learning with R, text mining with R, spatial modelling with R and R package building

More Articles ...