Sentiment analysis and Parts of Speech tagging in Dutch/French/English/German/Spanish/Italian

As part of our continuing effort to digitise poetry and to automate new forms of poetry, we released an R package called pattern.nlp, which is available at https://github.com/bnosac/pattern.nlp. It allows R users to do sentiment analysis and Parts of Speech tagging for text written in Dutch, French, English, German, Spanish or Italian. It can of course also be used for other purposes, such as data preparation as part of a topic modelling flow.

If you are interested in text mining, feel free to register for the text mining courses listed in our previous blog post.

If you just want to do sentiment analysis and POS tagging in these European languages, go ahead as follows. Sentiment analysis is available for Dutch, French & English.

library(pattern.nlp)

## Sentiment analysis
x <- pattern_sentiment("i really really hate iphones", language = "english")
y <- pattern_sentiment("de wereld is een mooie plaats, nietwaar sherlock", language = "dutch")
z <- pattern_sentiment("j'aime Paris, c'est super", language = "french")
rbind(x, y, z)

 polarity subjectivity id
    -0.80         0.90 i really really hate iphones
     0.70         1.00 de wereld is een mooie plaats, nietwaar sherlock
     0.65         0.75 j'aime Paris, c'est super
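
The scores above can be turned into labels for further processing. The following is a minimal sketch in base R, not part of pattern.nlp: it rebuilds the result shown above by hand and applies cut-offs of -0.1 and 0.1, which are assumptions chosen purely for illustration.

```r
# Values copied by hand from the output shown above.
scores <- data.frame(
  polarity     = c(-0.80, 0.70, 0.65),
  subjectivity = c(0.90, 1.00, 0.75),
  id           = c("i really really hate iphones",
                   "de wereld is een mooie plaats, nietwaar sherlock",
                   "j'aime Paris, c'est super"),
  stringsAsFactors = FALSE
)

# Assumed cut-offs (not defined by the package): beyond +/- 0.1 counts as
# positive or negative, anything in between as neutral.
scores$label <- ifelse(scores$polarity > 0.1, "positive",
                ifelse(scores$polarity < -0.1, "negative", "neutral"))

print(scores[, c("polarity", "label")])
```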

Parts of Speech tagging is available for Dutch, French, English, German, Spanish & Italian.

library(pattern.nlp)

x <- "Il pleure dans mon coeur comme il pleut sur la ville. Quelle est cette langueur qui penetre mon coeur?"
pattern_pos(x = x, language = 'french')

x <- "Avevamo vegliato tutta la notte - i miei amici ed io sotto lampade
di moschea dalle cupole di ottone traforato, stellate come le nostre anime,
perché come queste irradiate dal chiuso fulgòre di un cuore elettrico."
pattern_pos(x = x, language = 'italian')

We are also working on a Dutch wordnet, which will be fully released in due time. More information at https://github.com/weRbelgium/wordnet.dutch. Hope you use the package for spreading new languages!

Text Mining with R - upcoming training schedule

As part of the BNOSAC R course offering, which you can find at http://bnosac.be/images/bnosac/bnosac_courses_r.pdf, we offer several 2-day hands-on courses on using text mining tools for data analysis. They cover basic text handling, natural language engineering and statistical modelling on top of textual data.

Interested in upgrading your skills on text mining with R? You can register for the following dates.

2016: October 24-25: subscribe at https://lstat.kuleuven.be/training/coursedescriptions/text-mining-with-r
2016: November 14-15: subscribe at http://di-academy.com/event/text-mining-with-r/
2017: March 23-24: subscribe at https://lstat.kuleuven.be/training/coursedescriptions/text-mining-with-r

The following elements are covered in this course.

  1. Import of (structured) text data with focus on text encodings. Detection of language
  2. Cleaning of text data, regular expressions
  3. String distances
  4. Graphical displays of text data
  5. Natural language processing: stemming, parts-of-speech (POS) tagging, tokenization, lemmatisation, entity recognition
  6. Sentiment analysis
  7. Statistical topic detection modelling and visualisation (latent Dirichlet allocation)
  8. Automatic classification using predictive modelling based on text data
  9. Visualisation of correlations & topics
  10. Word embeddings
  11. Document similarities & Text alignment

Hope to see you there.

Good news from Belgium: Course on Applied spatial modelling with R (April 13-14)

In two weeks, our 2-day crash course on Applied spatial modelling with R (April 13-14, 2016) will be given at the University of Leuven, Belgium: https://lstat.kuleuven.be/training/applied-spatial-modelling-with-r
During this course you will learn about the following elements:

  • The sp package to handle spatial data (spatial points, lines, polygons, spatial data frames)
  • Importing spatial data and setting the spatial projection
  • Plotting spatial data on static and interactive maps
  • Adding graphical components to spatial maps
  • Manipulation of geospatial data, geocoding, distances, …
  • Density estimation, kriging and spatial point pattern analysis
  • Spatial regression

More information: https://lstat.kuleuven.be/training/applied-spatial-modelling-with-r. Registration can be done at https://lstat.kuleuven.be/forms/courses

New RStudio add-in to schedule R scripts

With the release of RStudio add-ins, a new era of productivity gains and new features for R users has arrived. Thanks to Oliver, who has written an RStudio add-in on top of taskscheduleR, scheduling and automating an R script from RStudio is now exactly one click away if you are working on Windows.

How? Just install these R packages and the add-in is ready in the add-in tab of your RStudio session. Select your R script and schedule it to run any time you want. We hope this saves you some day-to-day time; feel free to help make additional improvements. More information: https://github.com/bnosac/taskscheduleR.

install.packages('data.table')
install.packages('knitr')
install.packages('miniUI')
install.packages('shiny')
install.packages("taskscheduleR", repos = "http://www.datatailor.be/rcube", type = "source")

taskscheduleR: R package to schedule R scripts with the Windows Task Scheduler

If you are working on a Windows computer and want to schedule your R scripts while you are off running, sleeping or having a coffee break, the taskscheduleR package might be what you are looking for. 

The taskscheduleR R package is available at https://github.com/bnosac/taskscheduleR and it allows R users to do the following:

i) Get the list of scheduled tasks

ii) Remove a task

iii) Add a task

    - A task is basically a script with R code which is run through Rscript

    - You can schedule tasks 'ONCE', 'MONTHLY', 'WEEKLY', 'DAILY', 'HOURLY', 'MINUTE', 'ONLOGON', 'ONIDLE'

    - After the script has run, you can check the log, which can be found in the same folder as the R script. It contains the stdout & stderr of the Rscript.

Below is an example of how to schedule your R script once, or daily in the morning.

library(taskscheduleR)
myscript <- system.file("extdata", "helloworld.R", package = "taskscheduleR")

## run script once within 62 seconds
taskscheduler_create(taskname = "myfancyscript", rscript = myscript,
                     schedule = "ONCE", starttime = format(Sys.time() + 62, "%H:%M"))

## run script every day at 09:10
taskscheduler_create(taskname = "myfancyscriptdaily", rscript = myscript,
                     schedule = "DAILY", starttime = "09:10")

## delete the tasks
taskscheduler_delete(taskname = "myfancyscript")
taskscheduler_delete(taskname = "myfancyscriptdaily")
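
The `starttime` in the ONCE example is an `HH:MM` string, so the seconds are dropped. The sketch below, in base R only, illustrates why an offset of 62 seconds yields a start time that is always in the future. The helper `start_info` is hypothetical and not part of taskscheduleR, and it assumes the task fires at second 0 of the requested minute.

```r
# Hypothetical helper: for a given "now", return the HH:MM string that the
# ONCE example would pass as starttime, plus the seconds until that minute.
# Assumption: the task fires at second 0 of the requested minute.
start_info <- function(now, offset = 62) {
  target <- now + offset
  run_at <- as.POSIXct(trunc(target, units = "mins"))  # seconds are dropped
  list(
    starttime = format(target, "%H:%M", tz = "UTC"),
    wait_secs = as.numeric(difftime(run_at, now, units = "secs"))
  )
}

# Fixed timestamps in UTC keep the demonstration reproducible.
late  <- start_info(as.POSIXct("2016-04-01 09:10:45", tz = "UTC"))
exact <- start_info(as.POSIXct("2016-04-01 09:10:58", tz = "UTC"))

print(late$starttime)   # next minute, since 09:10:45 + 62s is 09:11:47
print(late$wait_secs)
print(exact$wait_secs)
```

Under that assumption the wait is never more than 62 seconds and never zero or negative, which matches the "within 62 seconds" comment in the example.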

When the task has run, you can look at the log, which contains everything from stdout and stderr. The log file is located in the same directory as the R script.

## log file is at the place where the helloworld.R script was located
system.file("extdata", "helloworld.log", package = "taskscheduleR")
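
To read the log for an arbitrary script, the path can be derived from the script path. This is a minimal sketch in base R. The helpers `log_path_for` and `read_task_log` are hypothetical and not part of taskscheduleR, and they assume the log keeps the script's base name with a `.log` extension, as the helloworld example suggests.

```r
# Hypothetical helper (not part of taskscheduleR): derive the log path,
# assuming the log sits next to the script and shares its base name.
log_path_for <- function(rscript) {
  file.path(dirname(rscript),
            paste0(tools::file_path_sans_ext(basename(rscript)), ".log"))
}

# Read the log if it exists; return an empty character vector otherwise.
read_task_log <- function(rscript) {
  logfile <- log_path_for(rscript)
  if (!file.exists(logfile)) return(character(0))
  readLines(logfile, warn = FALSE)
}

# Demonstration with a temporary script and a hand-written log file,
# so the sketch runs on any operating system without scheduling anything.
script <- file.path(tempdir(), "helloworld.R")
writeLines('cat("hello\\n")', script)
writeLines("hello", log_path_for(script))
print(read_task_log(script))
```

Returning an empty vector when the log is missing keeps the helper safe to call before the task has run for the first time.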

Who wants to set up an RStudio add-in for this?