seetrees
R package that enhances interpretability of stylometric results from stylo
The package seetrees is a small (for now) extension to the well-known stylo library. stylo performs unsupervised and supervised classification of texts based on bags of features, and relies heavily on hierarchical clustering. I add some tree cutting and feature-to-cluster association measures, so that one can detect high-level corpus bias and get insight into which words might drive the clustering.
I am not planning to release the package on CRAN any time soon; it is mostly used for demonstration, teaching, and corpus exploration. You can, however, install it from the GitHub repository. You can find a demonstration below.
Clusters & features
Install from GitHub (make sure you have the devtools package):
devtools::install_github("perechen/seetrees")
library(stylo)
library(seetrees)
data(lee) ## load one of the stylo datasets
stylo_res <- stylo(frequencies = lee, gui = FALSE) ## run stylo on the frequency table, without the GUI
view_tree(stylo_res, k = 2, right_margin = 12) ## redraws a dendrogram based on the distance matrix, cuts it into k groups, and shows the associated features
Check ?view_tree() for more details.
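To make the idea of tree cutting and feature-to-cluster association more concrete, here is a minimal sketch in base R only. It is not the seetrees implementation: the toy frequency table is invented for illustration, and the association measure (mean z-score of a word within a cluster) is an assumption chosen for simplicity, so view_tree() may well compute something different.

```r
## Toy table of word frequencies: 6 texts (rows) x 4 words (columns).
## All values are invented for illustration only.
freqs <- matrix(
  c(5.0, 3.0, 0.2, 0.3,
    5.2, 3.1, 0.1, 0.4,
    4.9, 2.9, 0.3, 0.2,
    3.0, 1.0, 2.0, 2.1,
    3.1, 1.2, 2.2, 1.9,
    2.9, 0.9, 1.9, 2.0),
  nrow = 6, byrow = TRUE,
  dimnames = list(c("a1", "a2", "a3", "b1", "b2", "b3"),
                  c("the", "of", "she", "he"))
)

## Scale each word to z-scores and compute distances between texts.
z <- scale(freqs)
d <- dist(z, method = "manhattan")

## Build the tree and cut it into k groups.
tree <- hclust(d, method = "ward.D2")
clusters <- cutree(tree, k = 2)

## Assumed association measure: mean z-score of each word per cluster.
## Positive values mean the word is overrepresented in that cluster.
assoc <- sapply(split(as.data.frame(z), clusters), colMeans)

## Word with the strongest positive association for each cluster.
top_word <- rownames(assoc)[apply(assoc, 2, which.max)]
names(top_word) <- colnames(assoc)

print(clusters)
print(round(assoc, 2))
print(top_word)
```

Rows of assoc are words and columns are cluster labels, so reading down a column gives a rough profile of which words pull that cluster apart from the rest of the corpus.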