Closed
Description
Does this `debug=TRUE` output help you understand what causes the execution error?
> tagged.results <- treetag(c("run", "ran", "running"), treetagger="manual", format="obj",
+ TT.tknz=FALSE , lang="en",
+ debug = TRUE,
+ TT.options=list(path="TreeTagger", preset="en"))
split=[[:space:]]
ign.comp=-
heuristics=abbr
heur.fix=c("’", "'"), c("’", "'")
sentc.end=., !, ?, ;, :
detect=FALSE, FALSE
clean.raw=
perl=FALSE
stopwords=
stemmer=
Assuming 'UTF-8' as encoding for the input file. If the results turn out to be erroneous, check the file for invalid characters, e.g. em.dashes or fancy quotes, and/or consider setting 'encoding' manually.
TT.tokenizer: koRpus::tokenize()
tempfile: C:\Users\Marcin\AppData\Local\Temp\Rtmp2PQ5Ts\tokenizef94305e2e24.txt
file: C:\Users\Marcin\AppData\Local\Temp\Rtmp2PQ5Ts\tempTextFromObjectf942ee2415d.txt
TT.lookup.command:
TT.pre.tagger:
TT.tagger: TreeTagger/bin/tree-tagger.exe
TT.opts: -token -lemma -sgml -pt-with-lemma -quiet
TT.params: TreeTagger/lib/english-utf8.par
TT.filter.command: | perl -pe 's/\tV[BDHV]/\tVB/;s/IN\/that/\tIN/;'
sys.tt.call: type C:\Users\Marcin\AppData\Local\Temp\Rtmp2PQ5Ts\tokenizef94305e2e24.txt | TreeTagger/bin/tree-tagger.exe TreeTagger/lib/english-utf8.par -token -lemma -sgml -pt-with-lemma -quiet | perl -pe 's/\tV[BDHV]/\tVB/;s/IN\/that/\tIN/;'
Error in matrix(unlist(strsplit(tagged.text, "\t")), ncol = 3, byrow = TRUE, :
'data' must be of a vector type, was 'NULL'
In addition: Warning message:
running command 'C:\Windows\system32\cmd.exe /c type C:\Users\Marcin\AppData\Local\Temp\Rtmp2PQ5Ts\tokenizef94305e2e24.txt | TreeTagger\bin\tree-tagger.exe TreeTagger\lib\english-utf8.par -token -lemma -sgml -pt-with-lemma -quiet | perl -pe 's\\tV[BDHV]\\tVB\;s\IN\\that\\tIN\;'' had status 9
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Polish_Poland.1250 LC_CTYPE=Polish_Poland.1250 LC_MONETARY=Polish_Poland.1250 LC_NUMERIC=C
[5] LC_TIME=Polish_Poland.1250
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] koRpus_0.10-2 data.table_1.9.6 gradientr_0.0.1 RWeka_0.4-33 tm_0.7-1 NLP_0.1-10
[7] stringi_1.1.5 NbClust_3.0 cluster_2.0.5 factoextra_1.0.4 foreach_1.4.3 openxlsx_3.0.0
[13] networkD3_0.3 VennDiagram_1.6.17 futile.logger_1.4.3 Boruta_5.2.0 ranger_0.6.0 scales_0.4.1
[19] ggmosaic_0.1.2 productplots_0.1.1 corrplot_0.77 stringr_1.2.0 magrittr_1.5 dplyr_0.5.0
[25] purrr_0.2.2 readr_1.0.0 tidyr_0.6.1 tibble_1.2 tidyverse_1.0.0 readxl_0.1.1
[31] haven_1.0.0 plyr_1.8.4 tables_0.8 Hmisc_4.0-2 ggplot2_2.2.1 Formula_1.2-1
[37] survival_2.40-1 lattice_0.20-34
loaded via a namespace (and not attached):
[1] devtools_1.12.0 RColorBrewer_1.1-2 httr_1.2.1 tools_3.3.1 backports_1.0.4 R6_2.2.0
[7] rpart_4.1-10 DBI_0.5-1 lazyeval_0.2.0 colorspace_1.3-1 nnet_7.3-12 withr_1.0.2
[13] sp_1.2-4 gridExtra_2.2.1 compiler_3.3.1 chron_2.3-47 htmlTable_1.9 flashClust_1.01-2
[19] plotly_4.5.6 labeling_0.3 slam_0.1-40 checkmate_1.8.2 digest_0.6.10 foreign_0.8-67
[25] ca_0.70 base64enc_0.1-3 jpeg_0.1-8 htmltools_0.3.5 maps_3.1.1 RWekajars_3.9.1-3
[31] FactoMineR_1.35 htmlwidgets_0.8 jsonlite_1.1 acepack_1.4.1 wordcloud_2.5 leaps_3.0
[37] geosphere_1.5-5 Matrix_1.2-7.1 Rcpp_0.12.8 munsell_0.4.3 proto_1.0.0 scatterplot3d_0.3-39
[43] MASS_7.3-45 parallel_3.3.1 ggrepel_0.6.5 splines_3.3.1 mapproj_1.2-4 knitr_1.15
[49] igraph_1.0.1 rjson_0.2.15 reshape2_1.4.2 codetools_0.2-15 futile.options_1.0.0 kohonen_3.0.2
[55] latticeExtra_0.6-28 lambda.r_1.1.9 spam_1.4-0 png_0.1-7 RgoogleMaps_1.4.1 gtable_0.2.0
[61] assertthat_0.1 viridisLite_0.1.3 rJava_0.9-8 iterators_1.0.8 memoise_1.0.0 fields_8.10
[67] ggmap_2.6.1
Activity
unDocUMeantIt commented on Apr 14, 2017
the `path` to TreeTagger is wrong. try an absolute path beginning with the drive letter.
MarcinKosinski commented on Apr 14, 2017
Hello, thanks for the fast reply. I have `TreeTagger` both in the repository directory I am currently working in and in the `C:/` directory.
For the absolute path the result is the same.
I have installed Perl and downloaded the `english-utf8.par` file (it is included in the `lib/` directory).
unDocUMeantIt commented on Apr 14, 2017
i see (and i was wondering why `treetag()` didn't complain about missing files, but of course it won't if they're not missing...).
what's inside the `tagged.results` object? if all goes well, it's a matrix with three columns.
can you open a command line and execute the full line after `sys.tt.call:`, beginning with `type`? what does it return? the `was 'NULL'` error usually occurs if TreeTagger doesn't return what koRpus is expecting, which is a character vector with tab separation (should look like three columns in the terminal).
[the command does work on my linux machine. but apart from your actual issue, it seems `TT.tknz=FALSE` cuts off the last character of the input vector -- i need to investigate this.]
MarcinKosinski commented on Apr 14, 2017
As with any `R` call that finishes with an error, the final object is not assigned. I get the message `Error: object 'tagged.results' not found`.
MarcinKosinski commented on Apr 14, 2017
I cannot run anything that comes after `sys.tt.call:`, as this requires some temporary files (which I no longer have) that are probably created from the source vector `c("run", "ran", "running")`.
MarcinKosinski commented on Apr 14, 2017
But it looks like the regular TreeTagger (not invoked from R) works properly (even though I didn't specify the final file to be lemmatized)
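For reference, the `'data' must be of a vector type, was 'NULL'` error from the debug output can be reproduced without TreeTagger. This is a minimal sketch, assuming the tagger output arrives as an empty character vector when the external call fails. The helper `parse_tagged()` is hypothetical and not part of koRpus, and the tags in the sample strings are made up for illustration.

```r
# Hypothetical helper, not part of koRpus: turns tab-separated tagger
# output (token, tag, lemma) into a three-column matrix, and fails with
# a readable message when there is nothing to parse.
parse_tagged <- function(tagged.text) {
  if (length(tagged.text) == 0) {
    stop("the tagger returned no output; check the paths in sys.tt.call")
  }
  matrix(unlist(strsplit(tagged.text, "\t")), ncol = 3, byrow = TRUE)
}

# Well-formed output: one "token<TAB>tag<TAB>lemma" string per token.
ok <- parse_tagged(c("run\tVB\trun", "ran\tVBD\trun"))
print(ok)

# Empty output (assumed to be what a failed shell call leaves behind):
# strsplit() yields an empty list, unlist() turns that into NULL, and
# matrix() rejects NULL with the error shown in the log.
empty <- character(0)
err <- tryCatch(
  matrix(unlist(strsplit(empty, "\t")), ncol = 3, byrow = TRUE),
  error = function(e) conditionMessage(e)
)
print(err)
```

Seen this way, the error says nothing about the input words themselves; it only indicates that the external pipeline produced no lines.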
unDocUMeantIt commented on Apr 14, 2017
`debug=TRUE` should actually keep the temp files as long as the R session is running. did you close your session in the meantime?
MarcinKosinski commented on Apr 14, 2017
I didn't. Maybe it does not keep them when the error appears? I am lemmatizing from the command line anyway :)
unDocUMeantIt commented on Apr 14, 2017
no, tempfiles should be kept, at least i'm sure they were in the past, because that's how we've been debugging these issues for a long time.
which brings me to the hypothesis that maybe generating the tempfile doesn't work for you in the first place? if the file can't be written, for whatever reason, then no tagging could be done.
have you successfully used koRpus earlier? just to see if this is something that was introduced with the last release.
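Whether R can write a temporary file at all can be checked directly in the same session. This is a minimal sketch using only base R. The file name pattern is arbitrary, and the one-token-per-line layout is an assumption about the tagger's input format rather than something confirmed in this thread.

```r
# Write the example tokens to a fresh temporary file, one per line.
tokens <- c("run", "ran", "running")
tmp <- tempfile(pattern = "tokenize", fileext = ".txt")
writeLines(tokens, tmp)

# Confirm the file exists and that its content round-trips unchanged.
written <- file.exists(tmp)
read_back <- readLines(tmp)
print(written)
print(read_back)

# Show where R puts its temporary files for this session.
print(tempdir())

# Clean up the test file.
unlink(tmp)
```

If `written` is `FALSE` or `writeLines()` throws an error, the problem lies with the temp directory rather than with TreeTagger.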
unDocUMeantIt commented on Apr 14, 2017
have you checked that perl is in your path on the command line? even if TreeTagger works, the following perl filter might break the full call. this should also cause an error if you try to use TreeTagger's tokenizer or the batch scripts that TreeTagger is usually run with.
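The PATH check can also be done from within R, which matters because R may see a different PATH than an interactive command prompt. This is a small sketch with base R's `Sys.which()`, which returns an empty string for programs it cannot find.

```r
# Look up perl on the PATH as seen by the current R session.
tools <- Sys.which("perl")
found <- nzchar(tools)

# An empty path means the perl filter at the end of the call cannot run.
print(data.frame(tool = names(tools), found = found, path = unname(tools)))
```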
MarcinKosinski commented on Apr 14, 2017
@unDocUMeantIt it was the issue of not being able to create a temporary file. Below is an example of another character string for which `treetag` works.
Maybe one should add a message if the `temp` file couldn't be created?
MarcinKosinski commented on Apr 14, 2017
Perl adds itself to the PATH during installation.
I did succeed with the `tokenize()` R function previously.
Thanks for answering and for your time!
unDocUMeantIt commented on Apr 14, 2017
now, that's odd -- looks like the tempfile is not created only when you use the `format="obj"` option, because there is successful tempfile creation in your second example. i'll leave this open until i have a clue what's (not) happening there.
unDocUMeantIt commented on Apr 14, 2017
i've looked at the `treetag()` code but so far have no clue what could cause this. it doesn't seem to happen on GNU/linux, but that doesn't explain it. it is unlikely that tempfiles are missing, because `treetag()` checks for their existence.
unDocUMeantIt commented on Apr 17, 2017
i've installed `koRpus` in a windows 10 VM and can replicate the problem.
it seems to be caused by inconsistencies between `file.path()` and `shell()`, something which used to work for years but now appears to be broken. try
versus the explicit
19 remaining items
unDocUMeantIt commented on May 8, 2017
@jmlehrfeld ah, now i see: your call is incomplete because you only defined the path to the *.exe file but nothing else. please try again with these settings instead:
does at least one of those work?
jmlehrfeld commented on May 8, 2017
I think so! I set my env as you specified, called the treetag function (without the `debug` argument), and got no warning or error messages back. I guess I'm all set then. Thanks so much!
trinker commented on May 10, 2017
I have tested this using koRpus ‘0.10.2’ on a Win 7 machine running R 3.4.0 and 3.3.1 and got no error. I have Win 10 at work; I'll try tomorrow. If path normalization is the issue, the `normalizePath` command is nice:
trinker commented on May 10, 2017
I see I didn't read the last comments here and was late to the party :-)
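If path normalization is what matters here, base R's `normalizePath()` can convert the relative path from the original call into Windows-style notation. This is a sketch only, assuming the `TreeTagger` directory sits in the working directory as in the first post. `mustWork = FALSE` is set so the call also runs where that directory does not exist; in that case the path may come back unresolved.

```r
# file.path() always joins components with forward slashes.
rel <- file.path("TreeTagger", "bin", "tree-tagger.exe")
print(rel)

# normalizePath() resolves an existing path to an absolute one; on
# Windows, winslash controls which separator is used in the result.
norm <- normalizePath(rel, winslash = "\\", mustWork = FALSE)
print(norm)
```

The same call applied to the TreeTagger root directory gives a value that could be passed as `path` in `TT.options`, in line with the earlier advice to use an absolute path starting with the drive letter.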
unDocUMeantIt commented on Jun 20, 2017
seems to be resolved for the moment.
JingwenRobineau commented on Jun 22, 2017
I had the same problem. Nothing above worked for me. Finally, I solved the problem by updating the version of R.
eyyarbasi commented on Jul 12, 2019
I have a similar problem. I tried the aforementioned methods but wasn't able to solve it. When I try to run the following code in RStudio, I get the following error.
However, on the command prompt, the same thing works. I just can't get it to work in RStudio. Any ideas why this might be happening? Btw, I'm relatively new to this stuff, so I'm sorry if I'm missing something obvious :)
I think I have all the necessary files in the working directory since I can get some results on the cmd. To me, it seems like everything is working but just not on the platform that I want to use. Thanks a lot!
unDocUMeantIt commented on Jul 13, 2019
@eyyarbasi:
could you please provide some more information on your system setup? which versions of `R` and `koRpus` are you using?
just a shot in the dark: can you try to start a plain R session (without RStudio) and run your R code from there? i would like to check if this issue is somehow related to the environment set up by RStudio (i don't use RStudio, it's all RKWard here ;)).
unDocUMeantIt commented on Jul 13, 2019
@eyyarbasi does the `lemma_tagged` object that you tried to create hold any data at all?
eyyarbasi commented on Jul 13, 2019
Thanks for the reply! RStudio is v1.2.1335 and R is 3.6.0.
Here's my `sessionInfo()`
And your intuition was right! It's an issue with RStudio. The object `lemma_tagged` doesn't even get created in RStudio, but the code works as a simple R script without RStudio. Somehow `treetag()` freaks out in RStudio. Open for further suggestions. Thanks again!
unDocUMeantIt commented on Jul 13, 2019
that's interesting -- and a bit puzzling...
during a workshop i gave recently one windows user ran into a problem with access permissions. i.e., his code would only run if he started RStudio with admin rights. IIRC, the application was unable to run the TreeTagger executable otherwise. running userland software as admin is not a solution, but if you could at least check once if this makes the problem go away, i'd get a clue where the actual issue lies.
one other hypothesis i have is RStudio's handling of `system()`/`shell()` calls. its terminal implementation seems to offer to run a windows version of bash, and i wonder if that could also be the case for `shell()` calls, because it would render all file paths useless. so it would probably be interesting to have a look at the return values of `shell()` for the command you successfully ran in `cmd.exe`. this call seems to fail in RStudio (but not in plain R). if it does, you should try to run it in small units to see at which point in the call chain it actually fails, like
update the temporary file, of course ;) this should tell us if it already fails accessing the text file, running `TreeTagger.exe` or `perl`.
XueWenSYan commented on Feb 4, 2022
Hi, I encountered the same error as eyyarbasi, and I'm also using windows. I tried running the code in base R gui and with administrative privileges but the error persists. I similarly could run treetag from command line. Has there been a solution now? Thank you!
unDocUMeantIt commented on Feb 5, 2022
in that case it is probably not the same issue. since this issue is already closed, could you please open a new one including info on your system setup (installed software packages with version numbers) and example code to reproduce the error?
thank you!
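The staged check suggested earlier in the thread (running the `sys.tt.call` pipeline in small units) could look roughly like this. It is only a sketch: `run_stage()` is a hypothetical helper, the commands are copied from the debug output in the first post, and the text file path is a placeholder that has to be replaced with the temp file from your own session.

```r
# Hypothetical helper: run one command and report its exit status and
# output. shell() is used on Windows because `type` and pipes need cmd.exe.
run_stage <- function(cmd) {
  runner <- if (.Platform$OS.type == "windows") shell else system
  out <- suppressWarnings(runner(cmd, intern = TRUE))
  status <- attr(out, "status")  # only set when the exit status is non-zero
  list(status = if (is.null(status)) 0L else status, output = out)
}

# Placeholder path: substitute the "tempfile:" entry from your debug output.
txt <- "C:\\path\\to\\tokenize.txt"
tagger <- paste(
  "TreeTagger\\bin\\tree-tagger.exe TreeTagger\\lib\\english-utf8.par",
  "-token -lemma -sgml -pt-with-lemma -quiet"
)
filter <- "perl -pe 's/\\tV[BDHV]/\\tVB/;s/IN\\/that/\\tIN/;'"

# Each stage adds one link of the original call chain.
stages <- c(
  read_file = paste("type", txt),
  tag       = paste("type", txt, "|", tagger),
  filter    = paste("type", txt, "|", tagger, "|", filter)
)

# On the affected Windows machine, run the stages one by one:
# results <- lapply(stages, run_stage)
# sapply(results, `[[`, "status")

# Harmless demonstration that works without TreeTagger installed.
demo <- run_stage("echo hello")
print(demo$status)
```

The first stage that returns a non-zero status or empty output points to the component that fails: reading the text file, running TreeTagger, or the perl filter.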