I hope you are all doing well. I am having some difficulties with an XML parser. My dataset looks like this:
<?xml version="1.0"?>
<bugrepository name="AspectJ">
<bug id="28974" opendate="2003-1-3 10:28:00" fixdate="2003-1-14 14:30:00">
<buginformation>
<summary>"Compiler error when introducing a ""final"" field"</summary>
<description>The aspecs the problem...</description>
</buginformation>
<fixedFiles>
<file>org.aspectj/modules/weaver/src/org/aspectj/weaver/AjcMemberMaker.java</file>
</fixedFiles>
</bug>
<bug id="28919" opendate="2002-12-30 16:40:00" fixdate="2003-1-14 15:06:00">
<buginformation>
<summary>waever tries to weave into native methods ...</summary>
<description>If youat org.aspectj.ajdt.internal.core.burce</description>
</buginformation>
<fixedFiles>
<file>org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/LazyMethodGen.java</file>
</fixedFiles>
</bug>
<bug id="29186" opendate="2003-1-8 21:22:00" fixdate="2003-1-14 16:43:00">
<buginformation>
<summary>ajc -emacssym chokes on pointcut that includes an intertype method</summary>
<description>This ;void Foo.ajc$before$Foo</description>
</buginformation>
<fixedFiles>
<file>org.aspectj/modules/weaver/src/org/aspectj/weaver/Lint.java</file>
<file>org.aspectj/modules/weaver/src/org/aspectj/weaver/Shadow.java</file>
<file>org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/BcelWeaver.java</file>
</fixedFiles>
</bug>
<bug id="29769" opendate="2003-1-19 11:42:00" fixdate="2003-1-24 21:17:00">
<buginformation>
<summary>Ajde does not support new AspectJ 1.1 compiler options</summary>
<description>The org.aspectj.ajpiler. This enhancement is needed byort.</description>
</buginformation>
<fixedFiles>
<file>org.aspectj/modules/ajde/testdata/examples/figures-coverage/figures/Figure.java</file>
<file>org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/AjdeTests.java</file>
<file>org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/ui/StructureViewManagerTest.java</file>
<file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/ajc/BuildArgParser.java</file>
<file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/core/builder/AjBuildConfig.java</file>
<file>org.aspectj/modules/org.aspectj.ajdt.core/testsrc/org/aspectj/ajdt/ajc/BuildArgParserTestCase.java</file>
</fixedFiles>
</bug>
<bug id="29959" opendate="2003-1-22 7:10:00" fixdate="2003-2-13 16:00:00">
<buginformation>
<summary>super call in intertype method declaration body causes VerifyError</summary>
<description>AspectJ Compiler 1.1 showstopper</description>
</buginformation>
<fixedFiles>
<file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/compiler/ast/InterTypeConstructorDeclaration.java</file>
<file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/ast/SuperFixerVisitor.java</file>
<file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/lookup/InterTypeMethodBinding.java</file>
<file>org.aspectj/modules/tests/bugs/SuperToIntro.java</file>
</fixedFiles>
</bug>
</bugrepository>
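I would like to extract certain elements of the dataset so that I can use them in a pandas DataFrame.
The first problem is getting all the child elements of a tag as a list.
At the moment my code either retrieves only the first element and ignores the others, or it retrieves all the elements but not in a structured form: the DataFrame shows only empty lists ([]) with no content.
Code: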
import pandas as pd
from xml.etree.ElementTree import parse
document = parse('dataset.xml')
summary = []
description = []
fixedfile = []
for item in document.iterfind('bug'):
    summary.append(item.findtext('buginformation/summary'))
    description.append(item.findtext('buginformation/description'))
    fixedfile.append(item.findall('fixedFiles/file'))
#df = pd.DataFrame({'summary':summary, 'description':description, 'fixed_files':fixedfile})
df = pd.DataFrame({'fixed_files': fixedfile})
df
Code:
import pandas as pd
from xml.etree.ElementTree import parse
document = parse('dataset.xml')
summary = []
description = []
fixedfile = []
for item in document.iterfind('bug'):
    summary.append(item.findtext('buginformation/summary'))
    description.append(item.findtext('buginformation/description'))
    fixedfile.append(item.findtext('fixedFiles/file'))
#df = pd.DataFrame({'summary':summary, 'description':description, 'fixed_files':fixedfile})
df = pd.DataFrame({'fixed_files': fixedfile})
df
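I found a solution in the question "Issue traversing an xml.etree.ElementTree tree with Python" that fits my case. It works, but not the way I want (one list of files per bug): I can load all the elements, but only as separate entries.
Code: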
import xml.etree.ElementTree as ET
import pandas as pd
xmldoc = ET.parse('dataset.xml')
root = xmldoc.getroot()
summary = []
description = []
fixedfile = []
for bug in xmldoc.iter(tag='bug'):
    #for item in document.iterfind('bug'):
    #summary.append(item.findtext('buginformation/summary'))
    #description.append(item.findtext('buginformation/description'))
    for file in bug.iterfind('./fixedFiles/file'):
        fixedfile.append([file.text])
fixedfile
#df = pd.DataFrame({'summary':summary, 'description':description, 'fixed_files':fixedfile})
df = pd.DataFrame({'fixed_files': fixedfile})
df
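When I also try to collect the other columns of my data (summary, description), I get the following error message: ValueError: All arrays must be of the same length
The second problem is selecting, for example, all the tags that have 2 or 3 child elements.
Kind regards,
Posted on 2021-09-12 18:04:45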
To keep the files in a list that stays associated with the description and summary, collect them into a new list for each bug.
Try:
import pandas as pd
from xml.etree.ElementTree import parse
document = parse('dataset.xml')
summary = []
description = []
fixedfile = []
for item in document.iterfind('bug'):
    summary.append(item.findtext('buginformation/summary'))
    description.append(item.findtext('buginformation/description'))
    fixedfile.append([elt.text for elt in item.findall('fixedFiles/file')])
df = pd.DataFrame({'summary': summary,
                   'description': description,
                   'fixed_files': fixedfile})
df
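For context on why the attempts in the question behaved differently, here is a minimal sketch on a made-up two-bug document. The XML string and the file names a.java, b.java and c.java are hypothetical, not taken from the dataset; only standard-library xml.etree.ElementTree calls are used. The empty lists seen in the question are probably the Element objects returned by findall(), which have no children of their own.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the dataset: two bugs with 1 and 2 fixed files.
SAMPLE = """<bugrepository>
  <bug id="1"><fixedFiles><file>a.java</file></fixedFiles></bug>
  <bug id="2"><fixedFiles><file>b.java</file><file>c.java</file></fixedFiles></bug>
</bugrepository>"""

root = ET.fromstring(SAMPLE)

# findall() returns Element objects, not their text.
elements = [bug.findall('fixedFiles/file') for bug in root.iterfind('bug')]

# findtext() returns the text of the first match only.
first_only = [bug.findtext('fixedFiles/file') for bug in root.iterfind('bug')]

# One list of file names per bug: as many entries as there are bugs.
per_bug = [[f.text for f in bug.findall('fixedFiles/file')]
           for bug in root.iterfind('bug')]

# One entry per file across all bugs: 3 entries for 2 bugs, so it cannot be
# combined with per-bug columns ("All arrays must be of the same length").
flat = [f.text for bug in root.iterfind('bug')
        for f in bug.iterfind('fixedFiles/file')]

print(first_only)  # ['a.java', 'b.java']
print(per_bug)     # [['a.java'], ['b.java', 'c.java']]
print(flat)        # ['a.java', 'b.java', 'c.java']
```

Only per_bug has one entry per bug, which is what the DataFrame constructor needs when it is combined with the summary and description lists.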
For the second part, this keeps only the bugs that have two or more files.
newdf = df[df.fixed_files.str.len() >= 2]
If you want the bugs that have exactly 2 or 3 files, then:
newdf = df[(df.fixed_files.str.len() == 2) | (df.fixed_files.str.len() == 3)]
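The same filter can be written a little more compactly with Series.between, which includes both endpoints by default. This is only a sketch: it assumes a DataFrame shaped like the one built above, and the three rows used here are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame built above.
df = pd.DataFrame({
    'summary': ['bug A', 'bug B', 'bug C'],
    'fixed_files': [['a.java'],
                    ['b.java', 'c.java'],
                    ['d.java', 'e.java', 'f.java', 'g.java']],
})

# Number of files per bug; .str.len() also works on list-valued cells.
n_files = df['fixed_files'].str.len()

# Keep bugs with 2 or 3 files (between() includes both endpoints by default).
newdf = df[n_files.between(2, 3)]
print(newdf['summary'].tolist())  # ['bug B']
```

Computing the count once and reusing it avoids calling .str.len() twice, which may matter on a larger repository.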
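Posted on 2021-09-12 18:10:02
The code below collects the data. The idea is to find all the bug elements and iterate over them. For each bug, look up the required child elements.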
import xml.etree.ElementTree as ET
import pandas as pd
xml = '''<?xml version="1.0"?>
<bugrepository name="AspectJ">
<bug id="28974" opendate="2003-1-3 10:28:00" fixdate="2003-1-14 14:30:00">
<buginformation>
<summary>"Compiler error when introducing a ""final"" field"</summary>
<description>The aspecs the problem...</description>
</buginformation>
<fixedFiles>
<file>org.aspectj/modules/weaver/src/org/aspectj/weaver/AjcMemberMaker.java</file>
</fixedFiles>
</bug>
<bug id="28919" opendate="2002-12-30 16:40:00" fixdate="2003-1-14 15:06:00">
<buginformation>
<summary>waever tries to weave into native methods ...</summary>
<description>If youat org.aspectj.ajdt.internal.core.burce</description>
</buginformation>
<fixedFiles>
<file>org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/LazyMethodGen.java</file>
</fixedFiles>
</bug>
<bug id="29186" opendate="2003-1-8 21:22:00" fixdate="2003-1-14 16:43:00">
<buginformation>
<summary>ajc -emacssym chokes on pointcut that includes an intertype method</summary>
<description>This ;void Foo.ajc$before$Foo</description>
</buginformation>
<fixedFiles>
<file>org.aspectj/modules/weaver/src/org/aspectj/weaver/Lint.java</file>
<file>org.aspectj/modules/weaver/src/org/aspectj/weaver/Shadow.java</file>
<file>org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/BcelWeaver.java</file>
</fixedFiles>
</bug>
<bug id="29769" opendate="2003-1-19 11:42:00" fixdate="2003-1-24 21:17:00">
<buginformation>
<summary>Ajde does not support new AspectJ 1.1 compiler options</summary>
<description>The org.aspectj.ajpiler. This enhancement is needed byort.</description>
</buginformation>
<fixedFiles>
<file>org.aspectj/modules/ajde/testdata/examples/figures-coverage/figures/Figure.java</file>
<file>org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/AjdeTests.java</file>
<file>org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/ui/StructureViewManagerTest.java</file>
<file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/ajc/BuildArgParser.java</file>
<file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/core/builder/AjBuildConfig.java</file>
<file>org.aspectj/modules/org.aspectj.ajdt.core/testsrc/org/aspectj/ajdt/ajc/BuildArgParserTestCase.java</file>
</fixedFiles>
</bug>
<bug id="29959" opendate="2003-1-22 7:10:00" fixdate="2003-2-13 16:00:00">
<buginformation>
<summary>super call in intertype method declaration body causes VerifyError</summary>
<description>AspectJ Compiler 1.1 showstopper</description>
</buginformation>
<fixedFiles>
<file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/compiler/ast/InterTypeConstructorDeclaration.java</file>
<file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/ast/SuperFixerVisitor.java</file>
<file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/lookup/InterTypeMethodBinding.java</file>
<file>org.aspectj/modules/tests/bugs/SuperToIntro.java</file>
</fixedFiles>
</bug>
</bugrepository>'''
data = []
root = ET.fromstring(xml)
for bug in root.findall('.//bug'):
    bug_info = bug.find('buginformation')
    fixed_files = bug.find('fixedFiles')
    entry = {
        'summary': bug_info.find('summary').text,
        'description': bug_info.find('description').text,
        # iterating an Element yields its direct children (the file elements)
        'fixedFiles': [x.text for x in fixed_files],
    }
    data.append(entry)
for entry in data:
    print(entry)
df = pd.DataFrame(data)
Output
{'summary': '"Compiler error when introducing a ""final"" field"', 'description': 'The aspecs the problem...', 'fixedFiles': ['org.aspectj/modules/weaver/src/org/aspectj/weaver/AjcMemberMaker.java']}
{'summary': 'waever tries to weave into native methods ...', 'description': 'If youat org.aspectj.ajdt.internal.core.burce', 'fixedFiles': ['org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/LazyMethodGen.java']}
{'summary': 'ajc -emacssym chokes on pointcut that includes an intertype method', 'description': 'This ;void Foo.ajc$before$Foo', 'fixedFiles': ['org.aspectj/modules/weaver/src/org/aspectj/weaver/Lint.java', 'org.aspectj/modules/weaver/src/org/aspectj/weaver/Shadow.java', 'org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/BcelWeaver.java']}
{'summary': 'Ajde does not support new AspectJ 1.1 compiler options', 'description': 'The org.aspectj.ajpiler. This enhancement is needed byort.', 'fixedFiles': ['org.aspectj/modules/ajde/testdata/examples/figures-coverage/figures/Figure.java', 'org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/AjdeTests.java', 'org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/ui/StructureViewManagerTest.java', 'org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/ajc/BuildArgParser.java', 'org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/core/builder/AjBuildConfig.java', 'org.aspectj/modules/org.aspectj.ajdt.core/testsrc/org/aspectj/ajdt/ajc/BuildArgParserTestCase.java']}
{'summary': 'super call in intertype method declaration body causes VerifyError', 'description': 'AspectJ Compiler 1.1 showstopper', 'fixedFiles': ['org.aspectj/modules/org.aspectj.ajdt.core/src/org/compiler/ast/InterTypeConstructorDeclaration.java', 'org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/ast/SuperFixerVisitor.java', 'org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/lookup/InterTypeMethodBinding.java', 'org.aspectj/modules/tests/bugs/SuperToIntro.java']}
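If some bugs might lack a buginformation or fixedFiles element (an assumption; every bug in the sample above has both), bug.find(...) returns None and the following attribute access or iteration fails. A hedged variant is to use findtext() with a default value and findall(), which returns an empty list when nothing matches. The XML snippet below is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the second bug has no description and no fixedFiles.
SAMPLE = """<bugrepository>
  <bug id="1">
    <buginformation><summary>s1</summary><description>d1</description></buginformation>
    <fixedFiles><file>a.java</file></fixedFiles>
  </bug>
  <bug id="2">
    <buginformation><summary>s2</summary></buginformation>
  </bug>
</bugrepository>"""

root = ET.fromstring(SAMPLE)
data = []
for bug in root.findall('bug'):
    data.append({
        # findtext() returns the given default when the path matches nothing.
        'summary': bug.findtext('buginformation/summary', default=''),
        'description': bug.findtext('buginformation/description', default=''),
        # findall() returns an empty list when there are no matches.
        'fixedFiles': [f.text for f in bug.findall('fixedFiles/file')],
    })

for entry in data:
    print(entry)
```

The resulting list of dicts can be passed to pd.DataFrame(data) exactly as before; incomplete bugs simply get an empty string or an empty list.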
https://stackoverflow.com/questions/69153935