Pandas, Recipes
Misc
- {{sketch}}: AI code-writing assistant for pandas users that understands the context of your data, greatly improving the relevance of suggestions. Sketch is usable in seconds and doesn’t require adding a plugin to your IDE. It’s just a regular function + method.
Transformations
Bin a numeric
    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"value": np.random.randint(0, 100, 20)})
    # Labels of the form "0 - 9", "10 - 19", ..., "90 - 99"
    labels = ["{0} - {1}".format(i, i + 9) for i in range(0, 100, 10)]
    # right=False makes each bin left-closed: [0, 10), [10, 20), ...
    df["group"] = pd.cut(df.value, range(0, 105, 10), right=False, labels=labels)
    df.head(10)

       value    group
    0     65  60 - 69
    1     49  40 - 49
    2     56  50 - 59
    3     43  40 - 49
    4     43  40 - 49
    5     91  90 - 99
    6     32  30 - 39
    7     87  80 - 89
    8     36  30 - 39
    9      8    0 - 9
- pd.qcut creates bin labels on its own, so the manual labels above probably aren't needed with it.
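A minimal sketch of the qcut note, assuming quartile bins are wanted. The values are the ten rows shown in the output above, and the column names quartile and quartile_name are made up for illustration.

```python
import pandas as pd

# Ten sample values (taken from the head() output above)
df = pd.DataFrame({"value": [65, 49, 56, 43, 43, 91, 32, 87, 36, 8]})

# qcut bins by quantiles and labels each bin with its interval automatically
df["quartile"] = pd.qcut(df["value"], q=4)

# Explicit labels can still be passed when friendlier names are preferred
df["quartile_name"] = pd.qcut(df["value"], q=4, labels=["Q1", "Q2", "Q3", "Q4"])

print(df)
```

Note that qcut splits on quantiles rather than fixed-width ranges, so it is not a drop-in replacement for the pd.cut call above.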
Strings
Generate list of strings from a variable
    s = pd.Series(["a", "b", "c", "a"], dtype="category")
    new_categories = ["Group %s" % g for g in s.cat.categories]
    new_categories

    ['Group a', 'Group b', 'Group c']
Extract city, state, and zip code from an address variable
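One possible approach, sketched under assumptions: the addresses are US-style strings ending in "City, ST 12345" (optionally ZIP+4), and the column name address and the sample rows are invented for illustration. It uses Series.str.extract with named regex groups; rows that don't match the pattern come back as NaN.

```python
import pandas as pd

# Hypothetical sample data; the real address format may differ
df = pd.DataFrame({"address": [
    "123 Main St, Springfield, IL 62704",
    "9 Elm Ave Apt 2, Portland, OR 97205-1234",
    "no parsable address",
]})

# city: text before the last comma; state: two capitals; zip: 5 digits or ZIP+4
pattern = r"(?P<city>[^,]+),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)\s*$"

parts = df["address"].str.extract(pattern)   # one column per named group
parts["city"] = parts["city"].str.strip()    # drop whitespace left after the comma
df = df.join(parts)

print(df)
```

Messier real-world addresses would likely need a more forgiving pattern or a dedicated address parser.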
Comparisons
equals
    # series
    series1 = pd.Series([1, 2, 3, 4])
    series2 = pd.Series([2, 1, 3, 4])
    series1.equals(series2)
    #> False

    # dfs
    df["device_id"].equals(df1["device_id"])
    #> True

    # List the columns whose values differ between the DataFrames df and df1
    for column in df.columns:
        if not df[column].equals(df1[column]):
            print(column)
- Flags differences in order, dimensions, and of course, differences in data
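A small sketch of the cases in the bullet above, using made-up Series. The dtype case goes beyond what the note lists; it is included because equals also expects matching dtypes.

```python
import pandas as pd

a = pd.Series([1, 2, 3, 4])

# Identical values, order, length, and dtype
same_data = a.equals(pd.Series([1, 2, 3, 4]))

# Same values in a different order
diff_order = a.equals(pd.Series([2, 1, 3, 4]))

# Different length (dimensions)
diff_length = a.equals(pd.Series([1, 2, 3]))

# Same numbers stored as floats instead of ints
diff_dtype = a.equals(pd.Series([1.0, 2.0, 3.0, 4.0]))

print(same_data, diff_order, diff_length, diff_dtype)
```

Only the first comparison should come back True; equals gives a single yes/no answer and does not say where the objects differ.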
compare
    df4 = df.compare(df1)
    df4
- device-temperature and device-status are the two common columns being compared
- self indicates the first DataFrame df and other indicates the other DataFrame df1.
- Essentially merges the two DataFrames and adds a column MultiIndex that places each DataFrame's values side by side, which helps you see the columns and positions where values have changed.
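To make the self/other layout concrete, here is a small sketch with invented data. The column names device_temperature and device_status and all values are hypothetical stand-ins for the DataFrames the notes refer to.

```python
import pandas as pd

# Hypothetical device readings
df = pd.DataFrame({
    "device_id": [1, 2, 3],
    "device_temperature": [20.5, 21.0, 19.8],
    "device_status": ["ok", "ok", "fail"],
})

# A copy with two cells changed
df1 = df.copy()
df1.loc[1, "device_temperature"] = 25.0
df1.loc[2, "device_status"] = "ok"

# Only rows and columns containing differences are kept by default;
# cells that match within those rows show up as NaN
diff = df.compare(df1)
print(diff)
```

In this sketch device_id is identical in both frames, so it does not appear in the result. compare expects both DataFrames to share the same row and column labels; passing keep_shape=True keeps every row and column instead of only the differing ones.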