
I have a MongoDB collection with a string field titled 'title' that I'm trying to index. There are about 9 million entries, and I'm just trying to get rid of the junk ones.

When I tried indexing it with:

db.getCollection("review_metadata").createIndex({"title" : 1})

I get this error:

{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 2,
        "ok" : 0,
        "errmsg" : "Btree::insert: key too large to index, failing amazon_reviews.review_metadata.$title_1 1860 { : \"***Super Charger*** Ultra Slim 40W AC Power Adapter Cord for Samsung Notebook/UltraBook : NP300U1A, NP300U1A-A01US, NP305U1A, NP305U1A-A01US, NP305U1A...\" }",
        "code" : 17282
}

So, is there a way to search through all of the values in the title field for values that would be too large to index?

Zeratas
    Possible duplicate of: http://stackoverflow.com/questions/29577713/string-field-value-length-in-mongodb – Ian Mercer May 09 '16 at 04:51
    Agreed, just use a minimum length of 1012 to find the offending docs. `db.test.find({title: /^.{1012,}$/})`. – JohnnyHK May 09 '16 at 13:37
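
The regex in the comment above counts characters, but the index-key limit is measured in bytes, so multi-byte UTF-8 titles can hit the limit at far fewer characters. A quick sketch of a byte-length check in Node.js (the 1012 cutoff is the comment's conservative figure, not an exact MongoDB constant):

```javascript
// The index entry limit is 1024 bytes total; the 1012 cutoff from the
// comment above leaves headroom for per-key overhead. Since the limit is
// in bytes, checking the UTF-8 byte length directly is safer for
// non-ASCII titles than counting characters with a regex.
const KEY_BYTE_CUTOFF = 1012;

function tooLargeToIndex(title) {
  return Buffer.byteLength(title, "utf8") > KEY_BYTE_CUTOFF;
}

// An ASCII title well under the limit:
console.log(tooLargeToIndex("Ultra Slim 40W AC Power Adapter")); // false

// 600 two-byte characters: only 600 characters, but 1200 bytes,
// so a character-counting regex would miss it:
console.log(tooLargeToIndex("é".repeat(600))); // true
```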

1 Answer


According to the manual, there is a 1024-byte limit on the total size of an index entry. Since you are indexing a text field, a text index could be a good solution:

db.review_metadata.createIndex(
   {
     title: "text",
     otherFieldThatCouldBeIndexedToo: "text"
   }
)
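
One caveat worth flagging: a text index matches on tokenized words, not arbitrary string prefixes, so it behaves differently from anchored `/^.../` regex queries. A rough Node.js illustration of the difference, approximating MongoDB's default tokenization (split on non-word characters, case-insensitive) rather than the real internals:

```javascript
// Approximates what a $text query matches: split the title into
// lowercased word tokens and look for the search term among them.
function textIndexWouldMatch(title, term) {
  const tokens = title.toLowerCase().split(/\W+/);
  return tokens.includes(term.toLowerCase());
}

// An anchored regex, by contrast, matches a literal prefix of the
// whole string, case-sensitively.
function prefixRegexWouldMatch(title, prefix) {
  return new RegExp("^" + prefix).test(title);
}

const title = "Ultra Slim 40W AC Power Adapter Cord";

console.log(textIndexWouldMatch(title, "adapter"));   // true: word match anywhere
console.log(prefixRegexWouldMatch(title, "Adapter")); // false: not a prefix
```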
profesor79
  • I went with this solution as I'm just testing something out and need to do easy /^**** search through the data. Thanks. – Zeratas May 09 '16 at 14:42