From cdb57900d40905a8cb0685bba807e4a6484475d2 Mon Sep 17 00:00:00 2001
From: Krishna Kaushik <131583096+kRiShNa-429407@users.noreply.github.com>
Date: Fri, 24 May 2024 21:11:33 +0530
Subject: [PATCH] Delete contrib/pandas/Importing_and_Exporting_Data_in_Pandas.md

---
 .../Importing_and_Exporting_Data_in_Pandas.md | 273 ------------------
 1 file changed, 273 deletions(-)
 delete mode 100644 contrib/pandas/Importing_and_Exporting_Data_in_Pandas.md

diff --git a/contrib/pandas/Importing_and_Exporting_Data_in_Pandas.md b/contrib/pandas/Importing_and_Exporting_Data_in_Pandas.md
deleted file mode 100644
index 4d0ffad..0000000
--- a/contrib/pandas/Importing_and_Exporting_Data_in_Pandas.md
+++ /dev/null
@@ -1,273 +0,0 @@
-# Importing_and_Exporting_Data_in_Pandas
-
->Created by Krishna Kaushik
-
-- **We can now create `Series` and `DataFrames` in pandas by hand, but in practice we rarely do. Usually we import data that already exists as a .csv (Comma-Separated Values) file, a spreadsheet file or something similar.**
-
-- *The good news is that pandas makes this kind of import easy through functions such as ``pd.read_csv()`` and, for Microsoft Excel files, ``pd.read_excel()``.*
-
-## 1. Importing from a Google sheet to a pandas dataframe
-
-*Let's say you want to get the information from a Google Sheets document into a pandas DataFrame.*
-
-*You could export it as a .csv file and then import it using ``pd.read_csv()``.*
-
-*In this case, the exported .csv file is called `Titanic.csv`.*
-
-
-```python
-# Import the Titanic dataset
-import pandas as pd
-
-titanic_df = pd.read_csv("https://raw.githubusercontent.com/kRiShNa-429407/learn-python/main/contrib/pandas/Titanic.csv")
-titanic_df
-```
-
-|  | pclass | survived | name | sex | age | sibsp | parch | ticket | fare | cabin | embarked | boat | body | home.dest |
-|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
-| 0 | 1 | 1 | Allen, Miss. Elisabeth Walton | female | 29.00 | 0 | 0 | 24160 | 211.3375 | B5 | S | 2 | NaN | St Louis, MO |
-| 1 | 1 | 1 | Allison, Master. Hudson Trevor | male | 0.92 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | 11 | NaN | Montreal, PQ / Chesterville, ON |
-| 2 | 1 | 0 | Allison, Miss. Helen Loraine | female | 2.00 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | NaN | NaN | Montreal, PQ / Chesterville, ON |
-| 3 | 1 | 0 | Allison, Mr. Hudson Joshua Creighton | male | 30.00 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | NaN | 135.0 | Montreal, PQ / Chesterville, ON |
-| 4 | 1 | 0 | Allison, Mrs. Hudson J C (Bessie Waldo Daniels) | female | 25.00 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | NaN | NaN | Montreal, PQ / Chesterville, ON |
-| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
-| 1304 | 3 | 0 | Zabour, Miss. Hileni | female | 14.50 | 1 | 0 | 2665 | 14.4542 | NaN | C | NaN | 328.0 | NaN |
-| 1305 | 3 | 0 | Zabour, Miss. Thamine | female | NaN | 1 | 0 | 2665 | 14.4542 | NaN | C | NaN | NaN | NaN |
-| 1306 | 3 | 0 | Zakarian, Mr. Mapriededer | male | 26.50 | 0 | 0 | 2656 | 7.2250 | NaN | C | NaN | 304.0 | NaN |
-| 1307 | 3 | 0 | Zakarian, Mr. Ortin | male | 27.00 | 0 | 0 | 2670 | 7.2250 | NaN | C | NaN | NaN | NaN |
-| 1308 | 3 | 0 | Zimmerman, Mr. Leo | male | 29.00 | 0 | 0 | 315082 | 7.8750 | NaN | S | NaN | NaN | NaN |
-
-1309 rows × 14 columns
-
-The dataset used here is in the same repository, ``learn-python`` (https://raw.githubusercontent.com/kRiShNa-429407/learn-python/main/contrib/pandas/Titanic.csv). I uploaded it, so you can use it from there.
-
-**Now we have the same data as in the Google Sheet, but as a pandas ``DataFrame``, which means we can apply all of pandas' functionality to it.**
-
-#### Note: ``pd.read_csv()`` takes either a path to a local file (relative paths are resolved against your current working directory) or a URL that points to the dataset.
-
-#### If you want to import data from GitHub, you can't use the regular page link directly. First switch to the raw file by clicking the Raw button in the repo, then use that link.
-
-#### You also can't read data directly from a `Kaggle` dataset page; you have to download it first, for example with the ``kaggle API``.
-
-## 2. The Anatomy of DataFrame
-
-**Different functions use different labels for the same things, which can get a little confusing.**
-
-- Rows are referred to as ``axis=0``
-- Columns are referred to as ``axis=1``
-
-## 3. Exporting Data
-
-**After you've made a few changes to your data, you might want to export and save it so someone else can access the changes.**
-
-**pandas allows you to export ``DataFrames`` to ``.csv`` format using ``.to_csv()``, or to a spreadsheet format using ``.to_excel()``.**
-
-### Exporting a dataframe to a CSV
-
-**We haven't made any changes to the ``titanic_df`` DataFrame yet, but let's try to export it anyway.**
-
-
-```python
-# Export the titanic_df DataFrame to CSV
-titanic_df.to_csv("exported_titanic.csv")
-```
-
-Running this will save a file called ``exported_titanic.csv`` to the current folder.
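
For reference, the export step described in the deleted tutorial can be sketched as follows. This is a minimal illustration, not part of the patch: `sample_df` and `exported_sample.csv` are hypothetical stand-ins for `titanic_df` and `exported_titanic.csv`, so the snippet runs without network access. It shows one detail the tutorial does not mention: by default `to_csv()` also writes the row index, which comes back as an extra unnamed column when the file is re-read, and `index=False` avoids that.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for titanic_df (three rows, three columns).
sample_df = pd.DataFrame(
    {
        "pclass": [1, 1, 3],
        "survived": [1, 0, 0],
        "sex": ["female", "male", "male"],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "exported_sample.csv")

    # Default export: the row index is written as an unnamed first column.
    sample_df.to_csv(csv_path)
    with_index = pd.read_csv(csv_path)

    # index=False leaves the row index out of the file.
    sample_df.to_csv(csv_path, index=False)
    without_index = pd.read_csv(csv_path)

print(list(with_index.columns))
print(list(without_index.columns))
```

Whether to keep the index depends on the data: when it is only the default 0, 1, 2, ... counter, dropping it usually gives a cleaner file for whoever reads it next.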