Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Execution halted with large queries #40

Open
noktilux opened this issue Feb 7, 2015 · 21 comments
Open

Execution halted with large queries #40

noktilux opened this issue Feb 7, 2015 · 21 comments

Comments

@noktilux
Copy link

noktilux commented Feb 7, 2015

Queries that run without any issues in version "0.9-3' now do not work in "0.10.1".

In one of my tests, I removed columns one by one and things worked properly when I got down to 203 columns. I thought perhaps there was a problem with the last one I removed, but that is not the case. If I remove another one instead, the same issue occurs. This is the message:

Error in validObject(.Object) : 
  invalid class “MySQLResult” object: invalid object for slot "Id" in class "MySQLResult": got class "list", should be or extend class "integer"
Calls: dbSendQuery ... .local -> new -> initialize -> initialize -> validObject
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted

In another situation (involving a JOIN of 2 tables), if I list out all the columns (270 in total), the query works, but if I write it as SELECT table1., table2. FROM ... , then it fails. The message is:

Error in is(object, Cl) : 
  INTEGER() can only be applied to a 'integer', not a 'char'
Calls: fetch -> .valueClassTest -> is -> is -> .Call
Execution halted

To repeat, the queries are all valid and work perfectly in MySQL (issuing direct commands) and in the older version of this library (RMySQL 0.9-3) which I have installed on another machine.

I have placed a test schema for you here:
https://heartsys.org/share/foo.tar.gz

Simply run "SELECT * FROM foo.tbl" to reproduce issue #1.

P.S. This is the version of R I am running:
R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

@hadley
Copy link
Member

hadley commented Feb 17, 2015

Could you please try with the dev version?

@noktilux
Copy link
Author

hadley.
can you please provide me with a link to the dev version? i am only seeing for windows binaries (here : http://cran.r-project.org/web/packages/RMySQL/index.html) and i run GNU-linux.

@hadley
Copy link
Member

hadley commented Feb 19, 2015

@noktilux
Copy link
Author

thanks. i will give it a go in the next couple of days.

@MJFitzpatrick86
Copy link

Was a solution to this every figured out? (I am trying to solve a similar problem with the same error message.)

@krlmlr
Copy link
Member

krlmlr commented Oct 27, 2015

@noktilux: Any updates here?

@MJFitzpatrick86: Could you please post a reproducible example?

@realgithup
Copy link

I am facing the same problem. Trying to fetch data for thousand companies in a loop. After 30-40 companies it generates the above error. Even dev version has this issue. I get the following error

Error in validObject(.Object) :
invalid class "MySQLResult" object: invalid object for slot "Id" in class "MySQLResult": got class "character", should be or extend class "integer"

I had to use version 0.9.3. But that version sometimes generates "caught segfault" error.

@hadley
Copy link
Member

hadley commented Oct 30, 2015

@realgithup we can't help with a reproducible example

@makoshark
Copy link

This is a real bug although it does seem to be one of those nasty "sometimes" errors. I got this error running an R script and then re-ran the same script on the same machine, absolutely no changes, no restarting MySQL, and it worked just fine.

@cturbelin
Copy link

Hi,
This is a very strange situation. I have exactly the same error and it's very hard to reproduce.
If I run the exact same query outside my script, it runs without any error and I can't get why.
So I'm not able to create a test case.

I dont know if this can help : if I add the line "dbColumnInfo(con, table) just before the call to the dbGetQuery call, the query is executed without any error (with table=the table used in dbGetQuery).

SessionInfo():
R version 3.2.1 (2015-06-18)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.3 (Santiago)

locale:
[1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
[7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] methods stats graphics grDevices utils datasets base

other attached packages:
[1] RColorBrewer_1.1-2 ggplot2_2.0.0 RMySQL_0.10.8 DBI_0.3.1
[5] Cairo_1.5-9

Best.

@krlmlr
Copy link
Member

krlmlr commented Feb 29, 2016

@cturbelin: Thanks. The R code and a dump of the database schema could help narrow down this issue.

@vst
Copy link

vst commented Mar 27, 2016

@cturbelin: Confirmed that your dbColumnInfo hack worked in my case, too.

I have been seeing this error on Mac OSX once in a while when I was running repetitive big queries.

R> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.3 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] devtools_1.10.0

loaded via a namespace (and not attached):
[1] tools_3.2.3   memoise_1.0.0 digest_0.6.9

But, when I moved to a Docker container running Debian 8.3, one of those repetitive big queries never worked. Same database server...

> sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

When I called the dbColumnInfo once before the mentioned series of queries (I mean only once), it managed to get through without any problems.

PS: The R routine is basically iterating over a stock symbols vector of length ~8000, and issuing an SQL query to retrieve their OHLC series from the database. Simple query, simple DB structure.

@vst
Copy link

vst commented Apr 12, 2016

OK, I have experienced another problem which looks like related to this.

As mentioned in the my previous comment, we have been calling the DBI::dbColumnInfo before attempting to read from the database table. Snippet from the R routine:

[...]

## Blind hack:
print(DBI::dbColumnInfo(xdbi::getConnection(), "table1"))
print(DBI::dbColumnInfo(xdbi::getConnection(), "table2"))

## Read from table1 and table2 separately:
[...]

Yesterday, our system has stopped in the middle of the day with this error message, right after printing the output of the first DBI::dbColumnInfo output:

Error in .local(conn, statement, ...) :
  connection with pending rows, close resultSet before continuing
Calls: print ... .local -> dbSendQuery -> dbSendQuery -> .local -> .Call
Execution halted

(Why does the call trace state dbSendQuery twice, right?)

Now... I know that we are running state machines and such errors must be reproducable. However, same Docker image running on local machine worked fine with the same database dump.

Going back to the server experiencing the problem, I've added a print statement for debugging purposes and it worked everytime when the script is run:

[...]

## Blind hack:
print("What is going on?")
print(DBI::dbColumnInfo(xdbi::getConnection(), "table1"))
print(DBI::dbColumnInfo(xdbi::getConnection(), "table2"))

## Read from table1 and table2 separately:
[...]

(Yep, I am surprised, too.)

I have quickly checked the C code for a possible problem in result set management, but couldn't find any. It looks short, precise and perfectly fine.

Then guess what: I have removed the blind hack entirely (including the print statement), and it never gave me the error message which it used to produce 2 weeks ago.

I am entirely clueless and don't want to waste package contributors' time... Could you address some functions in R and/or C/++ code which I can play with and see if I can produce some meaningful debugging information?

bollig added a commit to bollig/tso_cnv that referenced this issue Jun 18, 2016
Hack to try to address this bug: r-dbi/RMySQL#40
@MrDAndersen
Copy link

I am running into a similar issue. When running a sequence of queries using DBI::dbGetQuery, I intermittenly get the Error in validObject(.Object) : invalid class “MySQLResult” object: invalid object for slot "Id" in class "MySQLResult": got class "list", should be or extend class "integer" error. It seems most likely it is the second of the queries that is failing, but not all the time.

@eddy85br
Copy link

Impressively, here the same "blind hack" worked.

I'm doing a big query with 5 left joins.
It only worked by doing dbColumnInfo(con, "table_name") in all 6 tables used in the large select query, like this:

[ ... ]
    exec = dbExecute(con, paste('use sample_', sample_number, ";", sep=""))
    print(dbColumnInfo(con, "table1"))
    dbColumnInfo(con, "table2")
    dbColumnInfo(con, "table3")
    dbColumnInfo(con, "table4")
    dbColumnInfo(con, "table5")
    dbColumnInfo(con, "table6")
    if (exec == 0) {
        if (sample_number == "00") {
            df_samples = dbGetQuery(con, large_query_with_joins)
            [ ... ]
        }
    }
[ ... ]

@cocinerox
Copy link

Similar issue. Same workaround solved it.

@renkun-ken
Copy link

renkun-ken commented May 2, 2017

Executing via Rscript, I tried dbColumnInfo but this ends up with the following error:

Loading required package: methods
Error in .local(dbObj, ...) : 
  internal error in RS_DBI_getConnection: corrupt connection handle
Calls: dbColumnInfo ... dbClearResult -> dbIsValid -> dbIsValid -> .local -> .Call
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted

In R terminal, the error is

Error in validObject(.Object) : 
  invalid class “MySQLResult” object: invalid object for slot "Id" in class "MySQLResult": got class "character", should be or extend class "integer"

or

Loading required package: methods
Error in dbFetch(rs, n = -1, ...) : 
  INTEGER() can only be applied to a 'integer', not a 'char'
Calls: dbGetQuery -> dbGetQuery -> dbFetch -> dbFetch -> .Call
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Error in .local(dbObj, ...) : 
  INTEGER() can only be applied to a 'integer', not a 'char'
Calls: dbGetQuery ... dbClearResult -> dbIsValid -> dbIsValid -> .local -> .Call
Execution halted

@renkun-ken
Copy link

Through some experiments, I find that when select * or select A, B, C, ... where * or the specified columns represent more than 144 DECIMAL columns, the error will occur.

@renkun-ken
Copy link

When I adjust MySQL() to MySQL(fetch.default.rec = 10000) then the error is magically gone.

@thornad
Copy link

thornad commented May 11, 2017

@renkun-ken The "adjust MySQL() to MySQL(fetch.default.rec = 10000) then the error is magically gone"
worked for me too. Thanks for finding a solution for what the RMySQL developers should have fixed long ago.

@renkun-ken
Copy link

renkun-ken commented Jun 13, 2017

MySQL(fetch.default.rec = 10000) only works in RStudio, not in R terminal. The same code works fine in RStudio but crashes every time in R terminal. When the script is added to crontab, it crashes every time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests