
Question

Prashant Lakhera

Asked: 2015-03-11 11:50:22 +0800 CST

elasticsearch-river-jdbc inserting duplicate record in mysql db


Sorry, I am a newbie to Elasticsearch. I am using elasticsearch-river-jdbc to connect to a MySQL database. Everything works fine, except that every time the river runs on its schedule it inserts duplicate records. This is what I am using:

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
"type" : "jdbc",
"schedule" : "0 0-59 0-23 ? * *",
"jdbc" : {
    "url" : "jdbc:mysql://localhost:3306/test",
    "user" : "test",
    "password" : "test",
    "sql" : "select * from test"
    }
}'

I went through some docs which mention that the SQL SELECT can supply an _id. But as I understand it, this unique ID is only created when the river is created, and it is created on the Elasticsearch side, so MySQL has no knowledge of it. Please let me know if I am missing something.

If I write the SQL statement like this, I get the error below:

 "sql" : "select id as _id,a1,a2 from test"

[2015-03-10 13:16:00,018][ERROR][river.jdbc.RiverPipeline ] com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'id' in 'field list'
java.io.IOException: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'id' in 'field list'

2 Answers


Prashant Lakhera · Answer 1 · 2015-03-12T10:18:58+08:00


The workaround for this issue is to SELECT one of the fields as '_id':

 "sql" : "select *, revision as _id from test;"
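Why a stable '_id' stops the duplicates can be shown with a toy model. The sketch below is not Elasticsearch or the river itself: a plain dict stands in for the index, and the `index_rows` helper and the sample rows are hypothetical. It assumes that documents indexed without an explicit ID get a fresh generated ID on every run, while documents with the same ID overwrite each other.

```python
import uuid


def index_rows(index, rows, id_column=None):
    """Store rows in `index`, a dict standing in for an Elasticsearch index."""
    for row in rows:
        if id_column is None:
            # No _id supplied: a new random ID is generated on every run.
            doc_id = uuid.uuid4().hex
        else:
            # _id taken from a column: the same row always maps to the same ID.
            doc_id = str(row[id_column])
        index[doc_id] = dict(row)
    return index


# Hypothetical table contents; only the column name "revision" comes from the answer.
rows = [{"revision": 1, "a1": "x"}, {"revision": 2, "a1": "y"}]

without_id = {}
with_id = {}
for _ in range(3):  # three scheduled runs over the same, unchanged table
    index_rows(without_id, rows)
    index_rows(with_id, rows, id_column="revision")

print(len(without_id))  # 6: every run added two new documents
print(len(with_id))     # 2: every run overwrote the same two documents
```

The chosen column has to be unique per row, otherwise rows that share a value overwrite each other.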

The other issue is that when the river writes data to ES, the date and time format changes to UTC.

For example: 2015-03-11T00:00:00.000-07:00 and 1970-01-01T10:55:54.000-08:00

There is already a thread about this, but it has no workaround:

https://stackoverflow.com/questions/12969481/jprante-elasticsearch-jdbc-river-changing-the-date-value


Prashant Lakhera · Answer 2 · 2015-03-12T11:48:38+08:00


The solution for this issue is to set the timezone in the jdbc block:

"timezone" : "TimeZone.getDefault()"
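To see why the timezone setting matters, it can help to look at how one instant is written under two different offsets. This is a minimal sketch using only Python's standard library, not the river's own code. The -07:00 offset is an assumption taken from the example value in the previous answer.

```python
from datetime import datetime, timedelta, timezone

# Assumed local offset, copied from the example value 2015-03-11T00:00:00.000-07:00.
LOCAL = timezone(timedelta(hours=-7))

local_midnight = datetime(2015, 3, 11, 0, 0, 0, tzinfo=LOCAL)
as_utc = local_midnight.astimezone(timezone.utc)

print(local_midnight.isoformat())  # 2015-03-11T00:00:00-07:00
print(as_utc.isoformat())          # 2015-03-11T07:00:00+00:00
print(local_midnight == as_utc)    # True: same instant, different rendering
```

Both strings describe the same moment. Which one appears depends on the timezone used to interpret and render the value, which is what the setting controls.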

Also, I am saving the date and time in separate fields in the MySQL DB:

| Field | Type | Null | Key | Default | Extra |
| date | date | YES | | NULL | |
| time | time | YES | | NULL | |

Elasticsearch uses Joda time formats to store dates, so it automatically converts my date to a datetime.

Since the date field has no time component, zeros are automatically added for it.
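The padding described above can be illustrated with a short sketch. It is only an illustration of the effect, not what the river or Elasticsearch actually run. The helper names `pad_date` and `pad_time` are hypothetical. Attaching a time-only value to 1970-01-01 is an assumption, suggested by the value 1970-01-01T10:55:54.000-08:00 quoted in the previous answer.

```python
from datetime import date, datetime, time


def pad_date(d):
    """Turn a date-only value into a full timestamp by adding a zero time."""
    return datetime.combine(d, time(0, 0, 0))


def pad_time(t):
    """Turn a time-only value into a full timestamp on the epoch date (assumed)."""
    return datetime.combine(date(1970, 1, 1), t)


print(pad_date(date(2015, 3, 11)).isoformat())  # 2015-03-11T00:00:00
print(pad_time(time(10, 55, 54)).isoformat())   # 1970-01-01T10:55:54
```

Neither result is what a separate date field and time field are meant to show, which is why the split columns look wrong once they are stored as full timestamps.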

I need this split because I display the data in Kibana. As a workaround, I changed the date and time columns to varchar(20) (a bad idea, I know), and it is working fine now.
