
Is there any software that converts an SQL script into NoSQL? What are the drawbacks and limitations of these translators? What about SQL to Python?

By the way I'm interested in many different programming language translators including C++ or Perl to Python or Java - if you know any great tool, feel free to share. Writing a translator would be an exciting project, if none currently exists. 


Replies to This Discussion

Vincent,

About ten years ago, I started work on a C/C++ text parsing/translating application used for modernizing legacy computer languages. It has a web interface as well as a batch processor for large data volumes.

The product has two main aspects, one for up-front analysis work and another for automated translations.

The first part scans for keywords and stores the frequency of hits into a database. In addition to scanning for keywords, it also parses out company-specific information such as table and column names for later redundancy analysis. With the resulting scanned inventory database, we can do reporting with a BI product and R for word clouds and some statistical analysis. For example, the analysis can automatically establish the purpose, complexity, and modernization approach for each file.   
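The keyword-frequency scan described above can be sketched very simply. This is a minimal Python illustration, not Doug's actual C/C++ product: the keyword set, table schema, and function names are all assumptions made for the example.

```python
import re
import sqlite3
from collections import Counter

# Assumed keyword set; a real scan would also include company-specific
# table and column names for the redundancy analysis mentioned above.
KEYWORDS = {"SELECT", "INSERT", "UPDATE", "DELETE", "JOIN", "WHERE", "GROUP"}

def scan_file(text):
    """Count keyword hits in one source file (case-insensitive)."""
    tokens = re.findall(r"[A-Za-z_]\w*", text)
    return Counter(t.upper() for t in tokens if t.upper() in KEYWORDS)

def store_inventory(db_path, filename, counts):
    """Persist hit frequencies so a BI tool or R can report on them."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS inventory "
                "(file TEXT, keyword TEXT, hits INTEGER)")
    con.executemany("INSERT INTO inventory VALUES (?, ?, ?)",
                    [(filename, k, n) for k, n in counts.items()])
    con.commit()
    con.close()

counts = scan_file("SELECT a, b FROM t WHERE a > 1 GROUP BY b")
# counts -> {'SELECT': 1, 'WHERE': 1, 'GROUP': 1}
```

The resulting inventory table is what the BI reporting and word-cloud analysis would then be run against.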

The second component of the application is to parse the legacy reporting language, put its conceptual instructions into an abstract model, and then generate a modern BI replica of the original. 

I do not sell the software product; instead I use it as part of BI consulting engagements based on a methodology that I call DAPPER (Discovery, Analysis, Project Plan, Pilot Project, Execute Phased Project, and Retire Legacy Application). I have done this type of engagement for many well-known companies (see my LinkedIn profile projects). 

The bottom line is that I have a high-speed SQL parser written in C/C++, but would need to create the NoSQL generator. If you think there is a market for a SQL-to-NoSQL translator, contact me at [email protected]. It could definitely be a fun project.

Doug,

I share a similar interest - I believe there could be a huge market for this.  :)  So, I have written a SQL to MongoDB UI / translator. It is relatively sophisticated and robust, including support for aggregation via either the aggregation framework or map-reduce. Check it out if it sounds interesting; I'm enhancing it constantly. Grab the latest version on GitHub (currently 1.5.006). Just click on the .zip in the folder for the latest version, then click the "Raw" button to download it.

https://github.com/mongosql/releases
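To give a feel for what such a translator does (this is a hypothetical minimal sketch, not Keith's tool): a simple SELECT with a WHERE clause maps onto a MongoDB collection name, a filter document, and a projection. Real translators must also cover joins, expressions, and aggregation.

```python
import re

def sql_to_mongo(sql):
    """Translate 'SELECT cols FROM tbl [WHERE col = value]' into a
    MongoDB-style (collection, filter, projection) triple.
    Deliberately narrow: one pattern only, for illustration."""
    m = re.match(
        r"SELECT\s+(.+?)\s+FROM\s+(\w+)(?:\s+WHERE\s+(\w+)\s*=\s*(\w+))?\s*$",
        sql.strip(), re.IGNORECASE)
    if not m:
        raise ValueError("unsupported statement")
    cols, table, field, value = m.groups()
    projection = ({} if cols.strip() == "*"
                  else {c.strip(): 1 for c in cols.split(",")})
    filt = {field: int(value) if value.isdigit() else value} if field else {}
    return table, filt, projection

print(sql_to_mongo("SELECT name, age FROM users WHERE age = 30"))
# ('users', {'age': 30}, {'name': 1, 'age': 1})
```

The hard part, as the thread notes elsewhere, is everything beyond this pattern: aggregation, joins across collections, and type coercion.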

-Keith Schnable



Doug Lautzenheiser said:

Vincent,

About ten years ago, I started work on a C/C++ text parsing/translating application used for modernizing legacy computer languages. […]

While not directly SQL-to-NoSQL translators, the set of connectors that my company offers translates SQL to either the underlying NoSQL language (HiveQL, BigQuery SQL, etc.) or maps SQL to the correct API calls. The connectors are generally ODBC drivers that can be used in any typical BI application, or custom application, so you can use SQL with data sources such as MongoDB, Hive, Cassandra, HBase and more.

While NoSQL generally eschews the principles of SQL, it’s extremely useful to be able to analyze or access your data through traditional means, or in a general way without creating custom code for each data source. You can try out any of the drivers here: http://www.simba.com/connectors
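The core idea behind such a connector can be sketched in a few lines. This is a toy illustration under invented names (a plain dict stands in for the NoSQL service, and `api_scan` is a made-up API call), not Simba's actual driver architecture:

```python
import re

FAKE_API_DATA = {  # stand-in for a NoSQL collection
    "users": [{"name": "Jane", "age": 20}, {"name": "Joe", "age": 35}],
}

def api_scan(collection):
    """Stand-in for the NoSQL service's scan/read API call."""
    return FAKE_API_DATA[collection]

def query(sql):
    """Map 'SELECT * FROM tbl WHERE field > n' onto API calls, applying
    client-side whatever the backend cannot push down."""
    m = re.match(r"SELECT \* FROM (\w+) WHERE (\w+) > (\d+)", sql)
    table, field, n = m.group(1), m.group(2), int(m.group(3))
    return [row for row in api_scan(table) if row[field] > n]

rows = query("SELECT * FROM users WHERE age > 30")
# rows -> [{'name': 'Joe', 'age': 35}]
```

Production drivers differ chiefly in how much of the SQL they can push down to the server rather than evaluating locally.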


We are coming up with a tool that migrates data from SQL to NoSQL with ease. The beauty of the tool is its flexibility: the user can select the tables to be migrated or migrate the entire database. As of now the tool supports migration to MongoDB; the roadmap is to support all the major NoSQL databases, such as Cassandra, Elasticsearch, CouchDB, and FoundationDB. We started with MongoDB as it is widely used across many organizations. If interested, we can share the tool so that you can evaluate it and let us know the improvement areas; we will share all the details in a week's time once the announcement is made by the organization.
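The table-to-document migration described above boils down to turning each relational row into a document keyed by column name. A minimal sketch, with sqlite3 standing in for the source database and a plain list standing in for a MongoDB collection (with pymongo you would pass the list to `collection.insert_many`):

```python
import sqlite3

def migrate_table(con, table):
    """Convert every row of `table` into a dict keyed by column name,
    ready to be inserted as documents into a NoSQL collection."""
    cur = con.execute(f"SELECT * FROM {table}")
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]

# Demo source database (in-memory stand-in for the real SQL side).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER, name TEXT)")
con.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Jane"), (2, "Joe")])

docs = migrate_table(con, "users")
# docs -> [{'id': 1, 'name': 'Jane'}, {'id': 2, 'name': 'Joe'}]
```

Selecting which tables to migrate, as the tool allows, is then just a matter of which table names you pass in.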

I have tried to use translators. However, I have found that none of them is anywhere near 100% accurate. You still need to understand both languages; if you don't, you will have a hard time identifying where the logic is flawed.

@Vincent you mentioned that "you are trying to get SQL to work much faster, and integrate legacy SQL code into NoSQL environments". Which NoSQL environment in particular: document stores, property graphs, big tables, or RDF triple stores? In my opinion the problem is that you are attempting to map the most popular relational query language onto many other query languages that serve different data models. Coincidentally, my work with R3DM/S3DM can be applied to the unification of data models. In fact this is the approach of the W3C consortium with the RDF/OWL standard, so the obvious answer to your question would be to drive your code toward that standard. But I challenge whether that is the most efficient data model for coping with integration, interoperability, and usability issues, and I think I am not the only one. I will be happy to discuss associative, hypergraph, semiotic data modelling architecture with the R3DM/S3DM framework with your audience.

+1 HPCC Systems...used to work for them. GREAT piece of tech that is now open source!

Arjuna Chala said:

The HPCC Systems (http://hpccsystems.com) JDBC driver does exactly this. It accepts SQL and converts it to ECL, the HPCC Systems data programming language. The open source code base is available here - https://github.com/hpcc-systems/hpcc-jdbc

 

Vincent...I've been complaining to vendors for years about this problem. It's only going to get bigger. Whoever solves it will make a small fortune, I suspect (from large corporations who have a stable of "classically trained" SQL analysts and too much data for a structured data store).

The engineers at CData Software (my employer) have largely solved this problem. We have what are essentially SQL wrappers around APIs/protocols. As database/API experts, we've built the drivers to translate cleanly from SQL to NoSQL, pushing down as much SQL functionality to the underlying service/server as possible.

If the NoSQL data is semi-structured (like a JSON document) we build the SQL interface on top through a variety of "flattening" techniques. I'll include some links to articles and a white paper at the bottom for those interested in more information.

For example, if we had the following JSON document (forgive the formatting):

{
  "people": [
    {
      "age": 20,
      "gender": "F",
      "name": {
        "first": "Jane",
        "last": "Doe"
      },
      "vehicles": [
        {
          "type": "car",
          "make": "Honda",
          "model": "Civic",
          "year": 2015
        },
        {
          "type": "truck",
          "make": "Dodge",
          "model": "Ram",
          "year": 2010
        }
      ]
    },
    ...
  ]
}

We could extract a people "table" with a set of columns, each typed based on either a row-scan of the data in the document store or based on user specifications.

  • age
  • gender
  • name.first
  • name.last
  • vehicles.0.type
  • vehicles.0.make
  • vehicles.0.model
  • vehicles.0.year
  • vehicles.1.type
  • vehicles.1.make
  • vehicles.1.model
  • vehicles.1.year

Alternatively, we could present the vehicles collection(s) as a separate table or even simply return each nested structure as raw JSON.
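The dotted-name flattening scheme above is straightforward to sketch in Python. This is a generic illustration of the technique, not CData's implementation:

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into dotted column names,
    e.g. name.first and vehicles.0.make, as in the list above."""
    cols = {}
    if isinstance(obj, dict):
        for k, v in obj.items():
            cols.update(flatten(v, f"{prefix}{k}."))
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            cols.update(flatten(v, f"{prefix}{i}."))
    else:
        cols[prefix[:-1]] = obj  # scalar: strip the trailing dot
    return cols

person = {"age": 20, "gender": "F",
          "name": {"first": "Jane", "last": "Doe"},
          "vehicles": [{"type": "car", "make": "Honda"},
                       {"type": "truck", "make": "Dodge"}]}
row = flatten(person)
# row keys: age, gender, name.first, name.last,
#           vehicles.0.type, vehicles.0.make, vehicles.1.type, vehicles.1.make
```

Column types would then be inferred by a row-scan over many such documents, or supplied by the user, as described above.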

The configuration settings allow for granular control over how the data is interpreted to make sure users can get to NoSQL data in a way that is meaningful and useful.

The drivers are compatible with any tool, application, or language that supports the database-like connectivity we offer (JDBC, ODBC, ADO.NET, Excel, etc.).

© 2018 Data Science Central ®