What I learned after answering MongoDB questions on Stack Overflow for 20 days

Aug 24, 2022

For the last three months, I have been migrating some services from PostgreSQL to MongoDB. This migration involved writing a lot of MongoDB queries and making a lot of schema design decisions. Most days, I would be waist-deep in the MongoDB documentation by lunchtime. With no other side project in the pipeline, I decided to try answering some questions on StackOverflow. Having little experience as an “answerer” on the StackOverflow platform, I answered 20 MongoDB questions in 20 days, making it into the top 0.74% of all answerers on the platform in August. Here’s what I learned.

About StackOverflow

Getting into a community takes work.

Your reputation on StackOverflow directly indicates how many people you’ve helped. You can’t gain comments or followers by posting clickbait or hot takes. The only* means of gaining reputation points is by posting answers to questions based on programming or programming-adjacent topics.

Answering questions is hard.

The first challenge is to find a question that you can answer. Targeting a specific tag is the way to go. It is StackOverflow’s version of “finding your niche.” I hunted for questions that were based on MongoDB queries and features. This allowed me to answer multiple questions with little “context switching.”

The second challenge is to be quick to post an answer. Easy questions get answered very quickly. An unanswered question may get multiple answers between the time you start and finish typing yours. I suspect this would be even more challenging if you’re hunting for questions in tags that are more active than MongoDB.

The third challenge is that you cannot post a suggestion or a hint as an answer. Careless or incomplete answers get downvoted very quickly. You must replicate the asker’s environment, at least to some degree, and test the solution before posting. I found myself using Mongo Playground a lot for testing queries.

The imposter syndrome kicks in early.

Most of my answers made me feel like I was acting as a proxy for the documentation. It didn’t feel like I was using my “knowledge” or “skill” to answer the question; I just knew the correct query operator because I had gone through more pages of MongoDB’s documentation. I’m still trying to figure out whether the imposter syndrome that ails most of us caused me to feel this or if the feelings were valid.

Not all questions deserve an answer.

A lot of questions remain unanswered because they are poorly framed or provide incomplete information. I often found myself tempted to post an answer based on an assumption and risk a downvote. You can ask for more details without posting an answer once you unlock the privilege of commenting (50 reputation points). But till then, it sucks. I learned the hard way that it’s better to let a question go than to post an incomplete answer.

There is a genre of askers that completely alter the question by adding more details or “conditions” after it has received some answers. I despise them like nothing else.

About MongoDB

The biggest problem in using databases is connecting to them.

Questions on connection issues between applications and Mongo clusters crop up frequently. I’m curious if it is the same with every database, but Mongo doesn’t make it any easier. The usual suspects are Atlas’ IP whitelists and the “seed-list” connection string for replica sets. And if your replica set is self-managed, you may have incorrectly configured the servers and will likely waste time debugging the issue on the client side instead.

Is Mongoose “ubiquitous”?

Mongo StackOverflow has accepted that it’s okay to assume that the questioner must be using (or familiar with) Mongoose if the code snippet in the question looks like JavaScript. I couldn’t care less about what people assume, but I am still on the fence about being pro-ORM or against, having been burnt in the past.

Beware of the false promises made by ORMs/ODMs. Do not be ensnared by the illusion of productivity. Their magic is dark, darker than 0x000000.

MongoDB released a lot of stuff in v5 and v6.

Another reminder that tech changes fast. It took more than a few hours to get up to speed with all the new features in Mongo 5 and 6. I still need to learn about time series collections and window functions.

1000 ways to mess up schema design

Caution: mini rant ahead

If you want to query a triply nested array of objects and merge it with some keys in a different collection, fine! I’ll write you a 150-line long aggregation query. But it’s not my problem that you can’t understand it. And don’t ask me why it won’t use your precious index.
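For a feel of why triply nested arrays hurt, here is a small sketch using plain Python dicts instead of a live database. The store/department/shelf/item data and all function names are hypothetical. It contrasts walking three array levels (roughly what a pipeline has to do by unwinding each level) with filtering flat, one-document-per-item data.

```python
# Hypothetical document with three levels of nested arrays.
nested_store = {
    "_id": 1,
    "departments": [
        {"name": "toys", "shelves": [
            {"code": "T1", "items": [{"sku": "A1", "qty": 3},
                                     {"sku": "A2", "qty": 0}]},
        ]},
        {"name": "books", "shelves": [
            {"code": "B1", "items": [{"sku": "B7", "qty": 5}]},
        ]},
    ],
}


def in_stock_nested(store):
    """Walk all three array levels to reach the items, one loop per level."""
    return [
        item["sku"]
        for dept in store["departments"]
        for shelf in dept["shelves"]
        for item in shelf["items"]
        if item["qty"] > 0
    ]


def flatten(store):
    """Reshape the same data into one flat document per item."""
    return [
        {"store_id": store["_id"], "department": dept["name"],
         "shelf": shelf["code"], **item}
        for dept in store["departments"]
        for shelf in dept["shelves"]
        for item in shelf["items"]
    ]


flat_items = flatten(nested_store)

# Against flat documents, the MongoDB filter is a plain field comparison.
flat_filter = {"store_id": 1, "qty": {"$gt": 0}}


def in_stock_flat(items, store_id):
    """Pure-Python equivalent of applying flat_filter to the flat documents."""
    return [d["sku"] for d in items if d["store_id"] == store_id and d["qty"] > 0]


print(in_stock_nested(nested_store))
print(in_stock_flat(flat_items, 1))
```

Both functions return the same SKUs, but only the flat shape lets the filter name its fields directly, which is also the shape where an ordinary index on those fields is straightforward to reason about.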

Just because MongoDB will store almost anything you throw at it doesn’t mean you should. The first principle of schema design with MongoDB is “how you query your data determines how you store it.” Questions of the type “how to query for conditions X, Y, and Z in a given collection” were very common. And while I could almost always craft a query or aggregation pipeline for these, post a quick answer, and get some points, I would never run them if it were my database. Some of the queries people asked for were so non-standard that I wondered if they had spent enough time on schema design. I’m pretty sure I know the reason for this:

People don’t learn about database systems.

You need to learn about databases, not “programming with X database in language Y,” not “building APIs with X language and Y database,” both of which skim over everything that’s important. Take a boring course on databases. Read one of those old books that talk about the fifth normal form or dirty reads. Read MongoDB’s documentation and HOW-TO guides.