Converting CSV to JSON using Bash

I am trying to convert the following CSV data to JSON.


Africa,Kenya,NAI,281
Africa,Kenya,NAI,281
Asia,India,NSI,100
Asia,India,BSE,160
Asia,Pakistan,ISE,100
Asia,Pakistan,ANO,100
European Union,United Kingdom,LSE,100


This is the desired JSON format; I just can't manage to produce it. My unfinished work is posted below. Any help or direction would be appreciated.


{"name":"Africa",
"children":[
{"name":"Kenya",
"children":[
{"name":"NAI","size":"109"},
{"name":"NAA","size":"160"}]}]},
{"name":"Asia",
"children":[
{"name":"India",
"children":[
{"name":"NSI","size":"100"},
{"name":"BSE","size":"60"}]},
{"name":"Pakistan",
"children":[
{"name":"ISE","size":"120"},
{"name":"ANO","size":"433"}]}]},
{"name":"European Union",
"children":[
{"name":"United Kingdom",
"children":[
{"name":"LSE","size":"550"},
{"name":"PLU","size":"123"}]}]}


Incomplete work.

$1 is the file containing the CSV values shown above.


#!/bin/bash

pcountry=$(head -1 $1 | cut -d, -f2)

cat $1 | while read line ; do

region=$(echo $line|cut -d, -f1)
country=$(echo $line|cut -d, -f2)
code=$(echo $line|cut -d, -f3-)
size=$(echo $line|cut -d, -f4)

if test "$pcountry" == "$country" ;
then
echo -e {\"name\":\"$region\", '\n' \"children\": [ '\n'{\"name\":\"$country\",'\n'\"children\": [ '\n' \{\"name\":\"NAI\",\"size\":\"$size\"\}
else
if test "$pregion" == "$region"
then :
else
echo -e ,'\n'{\"name\":\""$region\", '\n' \"children\": [ '\n'{\"name\":\"$country\",'\n'\"children\": [ '\n' \{\"name\":\"NAI\",\"size\":\"$size\"\},


pcountry=$country
pregion=$region

fi ; done


The problem is that I can't find a way to detect when the values for a country end.
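One common way to detect where a country or region ends is to remember the keys of the previous row and compare them with the current one. The following is only a minimal sketch of that idea, not a full solution: it assumes rows of the same region and country are adjacent in the file, that fields contain no commas or quotes, and the function name csv_to_tree is made up for illustration.

```shell
#!/bin/bash
# csv_to_tree (hypothetical name): reads "region,country,code,size" rows on
# stdin and prints nested JSON. Assumes rows for the same region/country are
# adjacent and that no field needs JSON escaping.
csv_to_tree() {
  local region country code size
  local prev_region="" prev_country="" first=1
  printf '['
  while IFS=, read -r region country code size || [ -n "$region" ]; do
    [ -z "$region" ] && continue              # skip blank lines
    if [ "$region" != "$prev_region" ]; then
      # New region: close the previous country and region, if any.
      [ "$first" -eq 0 ] && printf ']}]},'
      printf '{"name":"%s","children":[{"name":"%s","children":[' "$region" "$country"
    elif [ "$country" != "$prev_country" ]; then
      # Same region, new country: close only the previous country.
      printf ']},{"name":"%s","children":[' "$country"
    else
      # Same country: separate sibling leaves with a comma.
      printf ','
    fi
    printf '{"name":"%s","size":"%s"}' "$code" "$size"
    prev_region=$region
    prev_country=$country
    first=0
  done
  # Close whatever is still open after the last row.
  [ "$first" -eq 0 ] && printf ']}]}'
  printf ']\n'
}
```

It could be called as csv_to_tree < "$1". The closing brackets are printed when the next row shows that the key changed, and once more after the loop for the final group.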

喜特乐


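As a number of commenters have said, using shell for this kind of conversion is a terrible idea. It is nearly impossible to do with bash builtins alone, and shell scripts exist to combine standard Unix commands such as sed, awk, cut, and so on. You should pick a language better suited to this kind of iterative parsing/processing to solve your problem.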

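However, because it's late and I've had too much coffee, I threw together a bash script (with a few bits of sed added to help with the parsing) that takes the example .csv data you have and outputs JSON in the format you specified. Here is the script: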


#! /bin/bash 
# Initial input file format:
#
# Africa,Kenya,NAI,281
# Africa,Kenya,NAA,281
# Asia,India,NSI,100
# Asia,India,BSE,160
# Asia,Pakistan,ISE,100
# Asia,Pakistan,ANO,100
# European Union,United Kingdom,LSE,100
#
# Intermediate file format for parsing to JSON:
#
# Africa|Kenya:NAI=281&NAA=281
# Asia|India:NSI=100&BSE=160|Pakistan:ISE=100&ANO=100
# European Union|United Kingdom:LSE=100
#
# Call as:
#
# $ ./script INPUTFILE.csv >OUTPUTFILE.json
#


# temporary files for output/parsing
TMP="./tmp.dat"
TMP2="./tmp2.dat"
>$TMP
>$TMP2

# read through initial file and output intermediate format
while IFS= read -r line
do
    region=$(echo "$line" | cut -d, -f1)
    country=$(echo "$line" | cut -d, -f2)
    code=$(echo "$line" | cut -d, -f3)
    size=$(echo "$line" | cut -d, -f4)

    # region record already started
    if grep -q "^$region|" "$TMP"; then
        >"$TMP2"
        while IFS= read -r rec
        do
            if echo "$rec" | grep -q "^$region|"
            then
                # -F matches "|country:" literally; in GNU grep, \| would mean alternation
                if echo "$rec" | grep -qF "|$country:"
                then
                    # country already present: append the new code to its entry
                    echo "$rec" | sed -e 's/\('"$country"':[^|][^|]*\)/\1\&'"$code"'='"$size"'/' >>"$TMP2"
                else
                    echo "$rec|$country:$code=$size" >>"$TMP2"
                fi
            else
                echo "$rec" >>"$TMP2"
            fi
        done < "$TMP"
        mv "$TMP2" "$TMP"
    else
        # new region
        echo "$region|$country:$code=$size" >>"$TMP"
    fi

done < "$1"

# Parse through our intermediary format and output JSON to standard out
echo "["
country_count=$(wc -l < "$TMP")
while IFS= read -r line
do
    # note: $country holds the region name here, and $region holds one country entry
    country=$(echo "$line" | cut -d\| -f1)
    echo "{ \"name\": \"$country\", "
    echo " \"children\": ["
    region_count=$(echo "$line" | cut -d\| -f2- | tr '|' '\n' | wc -l)
    echo "$line" | cut -d\| -f2- | tr '|' '\n' |
    while IFS= read -r region
    do
        name=$(echo "$region" | cut -d: -f1)
        echo " { \"name\": \"$name\", "
        echo " \"children\": ["
        code_count=$(echo "$region" | sed -e 's/^'"$name"'://' | tr '&' '\n' | wc -l)
        echo "$region" | sed -e 's/^'"$name"'://' | tr '&' '\n' |
        while IFS= read -r code_size
        do
            code=$(echo "$code_size" | cut -d= -f1)
            size=$(echo "$code_size" | cut -d= -f2)
            # emit a trailing comma for every entry except the last one
            code_count=$((code_count - 1))
            COMMA=""
            if [ "$code_count" -gt 0 ]; then
                COMMA=","
            fi
            echo " { \"name\": \"$code\", \"size\": \"$size\" }$COMMA "
        done
        echo " ]"
        region_count=$((region_count - 1))
        if [ "$region_count" -gt 0 ]; then
            echo " },"
        else
            echo " }"
        fi
    done
    echo " ]"
    country_count=$((country_count - 1))
    COMMA=""
    if [ "$country_count" -gt 0 ]; then
        COMMA=","
    fi
    echo "}$COMMA"

done < "$TMP"
echo "]"

exit 0


Here is the output produced by the script above:


[
{ "name": "Africa",
"children": [
{ "name": "Kenya",
"children": [
{ "name": "NAI", "size": "281" },
{ "name": "NAA", "size": "281" }
]
}
]
},
{ "name": "Asia",
"children": [
{ "name": "India",
"children": [
{ "name": "NSI", "size": "100" },
{ "name": "BSE", "size": "160" }
]
},
{ "name": "Pakistan",
"children": [
{ "name": "ISE", "size": "100" },
{ "name": "ANO", "size": "100" }
]
}
]
},
{ "name": "European Union",
"children": [
{ "name": "United Kingdom",
"children": [
{ "name": "LSE", "size": "100" }
]
}
]
}
]


Please do not use code like the above in any production environment.

詹大官人


Here is a solution using jq (https://stedolan.github.io/jq/).

If filter.jq contains the following filter


  reduce (
      split("\n")[]                       # split string into lines
    | split(",")                          # split data
    | select(length>0)                    # eliminate blanks
  ) as [$c1,$c2,$c3,$c4] (                # convert to object
      {}                                  # e.g. "Africa": { "Kenya": {
    ; setpath([$c1,$c2,"name"];$c3)       #   "name": "NAI",
    | setpath([$c1,$c2,"size"];$c4)       #   "size": "281"
  )                                       # }, }
| [                                       # then build final array of objects format:
    keys[] as $k1                         # [ {
  | {name: $k1, children: (               #   "name": "Africa",
      .[$k1]                              #   "children": {
    | keys[] as $k2                       #     "name": "Kenya",
    | {name: $k2, children:.[$k2]}        #     "children": { "name": "NAI", "size": "281" }
    )}                                    # ...
  ]



and data contains the sample data, then the command


$ jq -M -Rsr -f filter.jq data


produces


[
{
"name": "Africa",
"children": {
"name": "Kenya",
"children": {
"name": "NAI",
"size": "281"
}
}
},
{
"name": "Asia",
"children": {
"name": "India",
"children": {
"name": "BSE",
"size": "160"
}
}
},
{
"name": "Asia",
"children": {
"name": "Pakistan",
"children": {
"name": "ANO",
"size": "100"
}
}
},
{
"name": "European Union",
"children": {
"name": "United Kingdom",
"children": {
"name": "LSE",
"size": "100"
}
}
}
]
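Note that in the output above each children value is a single object and only the last code per country survives, whereas the question asks for arrays of children. One possible way to get nested arrays is jq's group_by. This is a rough sketch only: it assumes jq 1.5 or later is installed and that fields contain no embedded commas, and the wrapper name csv_to_tree_jq is made up for illustration.

```shell
#!/bin/bash
# csv_to_tree_jq (hypothetical name): reads "region,country,code,size" rows on
# stdin and prints compact nested JSON with arrays of children.
csv_to_tree_jq() {
  jq -c -R -s '
    [ split("\n")[] | select(length > 0) | split(",") ]   # rows as 4-element arrays
    | group_by(.[0])                                       # one group per region
    | map({ name: .[0][0],
            children: ( group_by(.[1])                     # one group per country
                        | map({ name: .[0][1],
                                children: map({ name: .[2], size: .[3] }) }) ) })
  '
}
```

It could be called as csv_to_tree_jq < data. Because group_by sorts by the grouping key, regions and countries come out in sorted order rather than file order.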
