python - Utilizing a loop or automated method instead of manually typing out each dataframe column header and type to fit a patt

admin2025-04-17  3

Summary: I want to be able to create list logic that can read any dataframe (varying column headers, data types etc) and assign it into the following structure list

[{'Name': column header 1 name, 'Type': column header 1 type, 'Name': column header 2 name, 'Type': column header 2 name}]

I have a sample table df_stack_exchange

data_stack_exchange = {'store': ['A','B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D'],
        'worker': [1,1,2,1,2,3,1,2,3,4],
        'boxes': [105, 90, 100, 80, 10, 200, 70, 210, 50, 0]}
df_stack_exchange = pandas.DataFrame(data_stack_exchange)

And want to create the following. I know it is not a functional dictionary with the duplicated keys but its for another defined purpose that requires this rigid structure

[{'Name': 'store', 'Type': 'object', 'Name': 'worker', 'Type': 'int64', 'Name': 'boxes', 'Type': 'int64'}]

Edit 1: The end state is I want to automate filling out the InputColumns on the create_data_set boto3 command without having to type out each column of my s3 file: .html

'InputColumns': [
{
'Name': 'string',
'Type': 'STRING'|'INTEGER'|'DECIMAL'|'DATETIME'|'BIT'|'BOOLEAN'|'JSON',
'SubType': 'FLOAT'|'FIXED'
},
]

Summary: I want to be able to create list logic that can read any dataframe (varying column headers, data types etc) and assign it into the following structure list

[{'Name': column header 1 name, 'Type': column header 1 type, 'Name': column header 2 name, 'Type': column header 2 name}]

I have a sample table df_stack_exchange

data_stack_exchange = {'store': ['A','B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D'],
        'worker': [1,1,2,1,2,3,1,2,3,4],
        'boxes': [105, 90, 100, 80, 10, 200, 70, 210, 50, 0]}
df_stack_exchange = pandas.DataFrame(data_stack_exchange)

And want to create the following. I know it is not a functional dictionary with the duplicated keys but its for another defined purpose that requires this rigid structure

[{'Name': 'store', 'Type': 'object', 'Name': 'worker', 'Type': 'int64', 'Name': 'boxes', 'Type': 'int64'}]

Edit 1: The end state is I want to automate filling out the InputColumns on the create_data_set boto3 command without having to type out each column of my s3 file: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/quicksight/client/create_data_set.html

'InputColumns': [
{
'Name': 'string',
'Type': 'STRING'|'INTEGER'|'DECIMAL'|'DATETIME'|'BIT'|'BOOLEAN'|'JSON',
'SubType': 'FLOAT'|'FIXED'
},
]
Share Improve this question edited Jan 31 at 0:15 qwerty12 asked Jan 30 at 22:08 qwerty12qwerty12 555 bronze badges 2
  • Your expected result is not syntactically acceptable in Python. You could produce it as a String but that probably is not what you intend. – user19077881 Commented Jan 30 at 23:49
  • Ah I see. My goal is slot it into the InputColumns for create_data_set: boto3.amazonaws.com/v1/documentation/api/latest/reference/… – qwerty12 Commented Jan 31 at 0:13
Add a comment  | 

2 Answers 2

Reset to default 0

Another dataframe might facilitate subsequent operations more easily.

df_new = df_stack_exchange.dtypes.to_frame('Type').reset_index()
df_new.rename(columns={'index':'Name'}, inplace=True)
df_new

    Name    Type
0   store   object
1   worker  int64
2   boxes   int64

Bit late to the party, but what you are really missing is a mapping from the pandas columns' types to the types you provide in your edit.

I've written the below dictionary to do that, and to build a string like you specified. You'll want to modify it to add SubType where necessary, and may want to modify it to construct a json instead of a string.

mapping_dict = {
"object": "STRING",
"string": "STRING",
"boolean": "BOOLEAN",
"category": "STRING",
"int8": "INTEGER",
"int16": "INTEGER",
"int32": "INTEGER",
"int64": "INTEGER",
"uint8": "INTEGER",
"uint16": "INTEGER",
"uint32": "INTEGER",
"uint64": "INTEGER",
"float16": "DECIMAL",
"float32": "DECIMAL",
"float64": "DECIMAL",
"datetime64": "DATETIME",
"timedelta64": "STRING",
"bool": "BIT",
"str": "STRING",
"unicode": "STRING",
"bytes": "STRING",
"dict": "JSON",
"list": "JSON",
}


result_string=""
for name, dtype in df_stack_exchange.dtypes.to_dict().items():
    result_string +=(f'{{Name: {name}, Type: {mapping_dict[str(dtype)]}}}, ')

print(result_string)
转载请注明原文地址:http://anycun.com/QandA/1744888280a89044.html