Work with MongoDB and Python - Connect, Manage and Store Your Data!
Python is a powerful language and MongoDB is a powerful NoSQL database. Together, they can be used to solve complex data problems with ease. In this post, we’ll show you how to install PyMongo and Pandas, and then use them to create a logging class and a MongoDB class. With these two classes, you can easily connect to MongoDB, create collections, insert data, update data, and delete data.
Why Work with MongoDB and Python?
MongoDB is a NoSQL database that is designed to handle large amounts of unstructured data. It uses a flexible schema, which allows you to store data in a variety of formats. This makes MongoDB a great choice for data-rich applications. Python, on the other hand, is a high-level language that is easy to learn and use. It has a large community of developers, who have created a wealth of libraries and tools that can be used to solve complex data problems. By working with MongoDB and Python, you can leverage the strengths of both technologies to build robust and scalable data applications.
How to Install PyMongo and Pandas
Before you can start working with MongoDB and Python, you need to install PyMongo and Pandas. PyMongo is the official Python driver for MongoDB, and Pandas is a data analysis library that we use here to export a collection to CSV. You can install them both with pip:
pip install pymongo
pip install pandas
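To confirm that both packages can be found before going further, a quick check along these lines can help. This is a minimal sketch: the helper name `check_installed` is hypothetical, and it relies only on `importlib.util.find_spec` from the standard library.

```python
from importlib.util import find_spec
from typing import Dict, List


def check_installed(packages: List[str]) -> Dict[str, bool]:
    """Map each top-level package name to whether Python can locate it."""
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # Report the status of the two packages this post depends on.
    for name, found in check_installed(["pymongo", "pandas"]).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

If either package is reported as missing, rerun the matching pip command in the same environment that runs your script.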
The Logging Class
One of the first things you’ll need to do when working with MongoDB and Python is create a logging class. This class will allow you to log all of your database operations to a file, so you can easily keep track of what’s going on. Here’s what the logging class looks like:
from datetime import datetime  # timestamps for log entries


class AppLogger:
    """
    Saves log entries to a file.

    Parameters
    ----------
    file: log file name. Default is logfile.log
    """

    def __init__(self, file: str = "logfile.log"):
        self.f_name = file

    def log(self, log_type, log_msg):
        """Append a log entry with its type and a timestamp to the file.

        Parameters
        ----------
        log_type: type of log: info, error, warning etc.
        log_msg: message to be saved
        """
        # current time
        now = datetime.now()
        current_time = now.strftime("%d-%m-%Y %H:%M:%S")
        # open the file in append mode and write one comma-separated line
        with open(self.f_name, "a+") as f:
            f.write(current_time + "," + log_type + "," + log_msg + "\n")
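Python's standard library `logging` module can produce the same comma-separated lines, which may be preferable if you later want log levels or several handlers. The sketch below is an alternative to the class above, not the approach this post uses; the function name `make_file_logger` and the `app.` logger name prefix are assumptions made for illustration.

```python
import logging


def make_file_logger(path: str = "logfile.log") -> logging.Logger:
    """Return a logger that appends 'timestamp,LEVEL,message' lines to path."""
    logger = logging.getLogger(f"app.{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep entries out of any parent loggers
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.FileHandler(path, mode="a", encoding="utf-8")
        handler.setFormatter(
            logging.Formatter(
                fmt="%(asctime)s,%(levelname)s,%(message)s",
                datefmt="%d-%m-%Y %H:%M:%S",  # same timestamp layout as AppLogger
            )
        )
        logger.addHandler(handler)
    return logger
```

Calling `make_file_logger("mongodb_logs.txt").info("mongodb object created")` would append one line in the same timestamp format, with the level written in upper case.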
The MongoDB Class
Once you’ve created the logging class, you can start working on the MongoDB class. This class will allow you to connect to MongoDB, create collections, insert data, update data, and delete data. Here’s what the MongoDB class looks like:
import pymongo
import pandas as pd
from typing import (
    Any,
    Dict,
    List,
    Union
)


class MongoDB:
    """
    MongoDB class through which we can perform most MongoDB tasks.

    Parameters
    ----------
    connection_url: connection url with password
    db_name: db name
    """

    def __init__(self, connection_url: str, db_name: str):
        # Establish a connection with MongoDB
        self.client = pymongo.MongoClient(connection_url)
        # Get a handle to the database (MongoDB creates it on the first write)
        self.db = self.client[db_name]
        self.logger = AppLogger("mongodb_logs.txt")  # create the logger
        self.logger.log("info", "mongodb object created")  # logging

    def create_collection(self, COLLECTION_NAME: str):
        """
        Function create_collection is used to create a new collection

        Parameters
        ----------
        COLLECTION_NAME: collection name
        """
        try:
            # MongoDB creates collections lazily: this only obtains a handle,
            # and the collection itself appears on the first insert.
            self.db[COLLECTION_NAME]
            self.logger.log("info", f"{COLLECTION_NAME} collection created")
        except Exception as e:
            self.logger.log("error", f"collection not created error : {str(e)}")

    def insert(self, collection_name: str, record: Union[Dict, List]):
        """
        Function insert is used to insert records into a collection

        Parameters
        ----------
        collection_name: collection name
        record: data to be inserted as dict; to insert many records use a list of dicts
        """
        try:
            collection = self.db[collection_name]
            if isinstance(record, dict):
                collection.insert_one(record)
            elif isinstance(record, list):
                collection.insert_many(record)
            else:
                # without this branch an unsupported type would be logged as a success
                raise TypeError("record must be a dict or a list of dicts")
            self.logger.log("info", "inserted successfully")  # logging
        except Exception as e:
            self.logger.log("error", f"insert error : {str(e)}")  # logging

    def update(self, collection_name: str, set_dict: Dict[Any, Any], where_dict: Dict[Any, Any]):
        """
        Function update is used to update records in a collection

        Parameters
        ----------
        collection_name: collection name
        set_dict: new values as plain field/value pairs (wrapped in $set here)
        where_dict: condition as dict
        """
        try:
            collection = self.db[collection_name]
            # update_many changes every document that matches where_dict
            collection.update_many(where_dict, {"$set": set_dict})
            self.logger.log("info", "updated successfully")  # logging
        except Exception as e:
            self.logger.log("error", f"update error : {str(e)}")  # logging

    def delete(self, collection_name: str, where_dict: Dict[Any, Any]):
        """
        Function delete is used to delete a record from a collection

        Parameters
        ----------
        collection_name: collection name
        where_dict: condition as dict
        """
        try:
            collection = self.db[collection_name]
            # delete_one removes only the first document that matches
            collection.delete_one(where_dict)
            self.logger.log("info", "deleted successfully")  # logging
        except Exception as e:
            self.logger.log("error", f"delete error : {str(e)}")  # logging

    def download(self, collection_name: str) -> str:
        """Export a collection to <collection_name>.csv and return the file name."""
        # query the MongoDB server for every document in the collection
        collection = self.db[collection_name]
        mongo_docs = list(collection.find())
        # Convert the mongo docs to a DataFrame
        docs = pd.DataFrame(mongo_docs)
        # Discard the Mongo ID; errors="ignore" avoids a KeyError on an empty collection
        docs = docs.drop(columns=["_id"], errors="ignore")
        docs.to_csv(f"{collection_name}.csv", index=False)
        return f"{collection_name}.csv"
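Because `update` wraps `set_dict` in `$set` itself, callers should pass plain field/value pairs, and nested fields are addressed with dot notation (for example `name.first_name`). Passing a dictionary that already contains `$set` would nest one operator inside another, which MongoDB is expected to reject. The sketch below shows one way to catch that mistake early; `build_set_update` is a hypothetical helper and not part of PyMongo.

```python
from typing import Any, Dict


def build_set_update(set_dict: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Wrap plain field/value pairs in a single $set operator document."""
    # Keys that start with "$" are update operators, not field names.
    operators = [key for key in set_dict if str(key).startswith("$")]
    if operators:
        raise ValueError(f"pass plain fields, not update operators: {operators}")
    return {"$set": dict(set_dict)}


if __name__ == "__main__":
    # Dot notation targets a field inside the nested "name" document.
    print(build_set_update({"name.first_name": "Shyam", "age": 24}))
```

The returned dictionary has the same shape as the second argument that `update` passes to `update_many`.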
The Main Method
if __name__ == '__main__':
    connection_url = ""  # mongodb connection URL
    db_name = ""  # db Name
    collection_name = ""  # mongodb collection name
    db = MongoDB(connection_url, db_name)
    # Create Collection
    db.create_collection(collection_name)
    # Insert new record
    record = {
        "username": "shyam",
        "name": {
            "first_name": "Shyam",
            "last_name": "Sharma"
        },
        "gender": "Male",
        "age": 23,
        "skills": [
            "python",
            "java",
            "SQL"
        ]
    }
    db.insert(collection_name, record)
    # Update records: match on the nested first_name field using dot notation
    where_dict = {"name.first_name": {"$regex": "^Shyam"}}
    # update() wraps these values in $set, so pass plain field/value pairs
    set_dict = {"age": 24}
    db.update(collection_name, set_dict, where_dict)
    # Delete the record
    where_dict = {"username": "shyam"}
    db.delete(collection_name, where_dict)
    # Download Collection
    db.download(collection_name)
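The connection URL in the main method is left blank because it usually contains credentials. One common option, sketched here, is to read it from environment variables instead of hard-coding it in the script. The variable names `MONGODB_URL` and `MONGODB_DB`, the helper `load_mongo_settings`, and the fallback values are all assumptions for illustration only.

```python
import os
from typing import Mapping, Tuple


def load_mongo_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (connection_url, db_name) read from the given environment mapping."""
    # Hypothetical variable names; the fallback URL assumes a local server
    # without authentication, which is only suitable for experiments.
    connection_url = env.get("MONGODB_URL", "mongodb://localhost:27017")
    db_name = env.get("MONGODB_DB", "test")
    return connection_url, db_name


if __name__ == "__main__":
    url, name = load_mongo_settings()
    # Print only the database name, since the URL may hold a password.
    print(f"database: {name}")
```

The two returned values can be passed straight to `MongoDB(connection_url, db_name)`.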