Hacker News | zardosht's comments

Another thing to keep in mind is that not all systems are meant to be CP systems.


Another engineer at Tokutek here. As you can see, we are up to 2.4, and have been investigating 2.6 and geo. We prioritize and address all possible features, whether they come from MongoDB 2.6 or are our own innovations like partitioned collections, based on customer and user feedback.

Also, 2.6 is not an all-or-nothing proposition that needs to be done in one release. Features with the most demand (whether that's the new write commands or aggregation framework improvements) will be done before others. We've done this before: when we released 1.0, which was based on 2.2, we also shipped hash-based sharding with it, a 2.4 feature, because users demanded it.

As for pushing bug fixes upstream, we file bugs when we see them. Our VP of engineering was a winner in the MongoDB 2.6 bug hunt with SERVER-12878. SERVER-9848 and SERVER-14382 are among the bugs I've filed.


Thanks for the response. I read a post on the mongo-user group, and that's what I noticed: a number of features are ported as and when necessary. Don't read what I say too negatively; I'm mostly curious. My impression is that the little we (I) get exposed to regarding TokuMX specifically presents it as superior to Mongo, as a "choose us or lose out" thing. But that happens when one doesn't follow a topic closely and only sees it mentioned here and there (understandable, since Mongo has been the subject of "my start-up failed, and I blame it on Mongo; so burn Mongo" kinds of discussions).

One more question if you don't mind: MongoDB will support various storage engines from 2.8, including Tokutek's storage engine (I can't remember its name). Notwithstanding other innovations in TokuMX, would switching from mmap to Tokutek's storage engine mean that one ends up with Mongo having geo-indices and other bells and whistles, while also getting TokuMX's main feature?


Your last question is a bit loaded with a bunch of "ifs", so let's unwind it. I don't know what MongoDB will "support" as far as other engines go. But assuming we, Tokutek, release something we support that is our engine plugged into 2.8 via MongoDB's storage engine plugin, then, according to the design we heard about at MongoDB World, that product will be what you think it is: Mongo with geo and "other bells", plus TokuMX's compression and write performance.

But 2.8 is still a ways off, and the storage engine API is a very fresh development. I don't think anyone is in a position to really guarantee what it will look like or how TokuFT (https://github.com/Tokutek/ft-index/) will plug into it. I definitely cannot make any promises.

If you are interested in TokuMX + some missing features from MongoDB (sounds like geo), and don't mind discussing your needs and use cases with our sales guys, please give us feedback at http://www.tokutek.com/contact/. As I mentioned previously, user feedback drives what we do, so at the very least, you can provide some additional data points.


We didn't need GEO indexing but what Toku does offer is pretty exciting. Primary wins for us include multi-query transactions, compression, fractal tree indexes (thus overall insert and query performance), and clustering indexes.


Not for single-server performance. The database-level lock severely limits MongoDB's single-server performance. Just look up the sysbench benchmark comparing MongoDB with TokuMX (which I work on).


TokuMX, which I work on, has document level locking and compression right now.


Trying amisaserver; found it somewhere in the comments below. It claims to have MVCC.


sweet then, will give tokumx a shot as well. Thanks


TokuMX does have MVCC.


(I work for Tokutek)

Write concurrency: yes, TokuMX does not have a database-level reader/writer lock.

Index Building: yes, fractal trees can write data much more efficiently, so if index building is a problem, I bet TokuMX solves it.

Practically reducing file size: honestly, I am not sure, because thanks to our compression this has not been a general issue for our users. Our reindex command could reduce file size, but I cannot point to examples.

One of our big goals is to address storage issues MongoDB has.
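For readers unfamiliar with why fractal trees build indexes so much faster, here is a toy sketch (not TokuFT's actual code, and all names are illustrative) of the underlying idea: writes land in a cheap in-memory buffer and are merged into the expensive sorted structure in large batches, so each insert pays only an amortized share of the I/O.

```python
import bisect

class BufferedIndex:
    """Toy model of fractal-tree-style write buffering."""

    def __init__(self, buffer_limit=1024):
        self.buffer = []          # unsorted recent writes (cheap appends)
        self.sorted_keys = []     # stand-in for the on-disk sorted structure
        self.buffer_limit = buffer_limit
        self.flushes = 0          # counts "expensive" batch merges

    def insert(self, key):
        self.buffer.append(key)
        if len(self.buffer) >= self.buffer_limit:
            self._flush()

    def _flush(self):
        # one batch merge instead of one random write per key
        self.sorted_keys = sorted(self.sorted_keys + self.buffer)
        self.buffer = []
        self.flushes += 1

    def contains(self, key):
        if key in self.buffer:
            return True
        i = bisect.bisect_left(self.sorted_keys, key)
        return i < len(self.sorted_keys) and self.sorted_keys[i] == key
```

With a buffer of 1024 entries, 2500 inserts trigger only two batch merges; a B-tree-style structure would instead pay a search-and-write per key.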


ddorian, can you elaborate on what that means?


What I mean is: every node is the same, with no mongos. You just connect to one random mongod and it handles the mongos functionality.

So if you grow, you add one node, not a replica set (which could be three nodes if you have 3x replication).


Unfortunately, this would break compatibility with existing MongoDB applications more than we would probably be willing to do. However, there's no reason RethinkDB couldn't use Fractal Tree indexing instead of B-trees, given some engineering effort.


But RethinkDB doesn't have range sharding (they had it, but changed it to random(id)), and there's also no sharding on a custom field.


Roger,

I work at Tokutek (and wrote the post above). I'm sorry you ran into issues trying out TokuMX. I assure you, we are "ready", as we have users running in production.

Nevertheless, you ran into problems and that is unfortunate. If you have details, can you please share them with the tokumx-user google group? We might be able to help. I suspect the transition to using a transactional system like TokuMX where entire statements are transactional is resulting in some "gotchas", but that is just an educated guess.

-Zardosht


I mean ready in the sense that pointing code that worked flawlessly against MongoDB at TokuMX just works flawlessly too.

I uninstalled Toku and went back to MongoDB so I can't provide any further testing. (The mongorestore takes days.)

I can tell you what code was running at the time. It reads events sorted by user id and timestamp, and then discovers session boundaries in that. A new session object (in a different collection) is written out with all the events as a subdocument list. (In rarer cases an existing session object is updated.) This was happening in 8 separate processes, all in Python/pymongo. There are no statements running that affect more than one document, nor any need for transactions.
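For concreteness, the pass described above might look something like this hypothetical sketch (the gap threshold, field names, and structure are my assumptions, not the original code): walk events sorted by (user_id, timestamp), split wherever the user changes or the gap exceeds a timeout, and emit one session document with the events embedded as a list.

```python
from datetime import timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed session-boundary timeout

def sessionize(events):
    """events: iterable of dicts with 'user_id' and 'ts' keys,
    sorted by (user_id, ts). Yields one session doc per boundary."""
    current = []
    for ev in events:
        if current and (ev["user_id"] != current[-1]["user_id"]
                        or ev["ts"] - current[-1]["ts"] > SESSION_GAP):
            yield {"user_id": current[0]["user_id"],
                   "start": current[0]["ts"],
                   "end": current[-1]["ts"],
                   "events": current}
            current = []
        current.append(ev)
    if current:  # flush the final open session
        yield {"user_id": current[0]["user_id"],
               "start": current[0]["ts"],
               "end": current[-1]["ts"],
               "events": current}
```

Each yielded document would then be inserted into the sessions collection; with 8 processes doing this concurrently, every such insert becomes its own implicit transaction under TokuMX.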


If you were using upserts I expect you were having problems due to the optimizer retrying all possible plans (including table scan) periodically. This is reflected in https://github.com/Tokutek/mongo/issues/796 and is fixed in 1.4.0. If you'd like to try another evaluation, get in touch with us and we can help you track down whatever problems you see.

Not all MongoDB code will use TokuMX optimally without changes. Concurrency is hard, and MongoDB encourages some patterns that are bad for any concurrent database. For example, count() for an entire collection is not, and could never be, as cheap in a concurrent database like TokuMX as it is in MongoDB.
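The reason, in a toy illustration (this is the general MVCC idea, not TokuMX internals): under multi-version concurrency control each document version carries creation/deletion timestamps, so a reader must check visibility per version rather than return a single stored total.

```python
def visible_count(versions, reader_ts):
    """versions: list of (created_ts, deleted_ts-or-None) pairs.
    Counts documents visible to a snapshot taken at reader_ts."""
    return sum(
        1
        for created, deleted in versions
        if created <= reader_ts and (deleted is None or deleted > reader_ts)
    )

# A reader whose snapshot is ts=5 does not see a row deleted at ts=4,
# nor a row created at ts=7, even though both still exist in storage.
```

A single-writer database can keep one authoritative counter; an MVCC one cannot, because different concurrent readers legitimately see different counts.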


Thanks for the offer, but the mongorestore times (against MongoDB) being over a week makes this too risky.

The code making changes was mostly inserts with a few upserts, but the latter were by _id. My hypothesis is that TokuMX adds implicit transactions, that there are some arbitrary restrictions around those transactions (e.g. how many can be outstanding at once, timeouts in lock acquisition), and that after a few hours one of those was hit. The error message was something about being unable to start a transaction.

> Not all mongodb code will optimally use tokumx without any changes

The goal wasn't to be optimal or anything like that. It was initially about space consumption (where you did really well) and verifying the same client code ran correctly. We have two setups so one would run toku and one mongodb and data processing results compared.


Ok. Well, you said you were waiting for it to be ready, and I think it is. We'll be here when you get a week free to tinker.


MongoDB 2.2 and 2.4


Just as in our TokuDB for MySQL product, we have zlib and lzma compression available.


Staying up to date is not an all-or-none proposition. We use feedback to drive direction. For example, even though we used 2.2 as a base, user feedback compelled us to include hash-based sharding, a 2.4 feature, in this release.

