How do I parse multiple XML files and then save them with a partially new file name?

Asked 2 days ago
Viewed 6 times

I want to parse multiple XML files and then save each of them under a new file name, keeping the first part of the original file name.

For example, if I have the following two XML files:

Batch123.cogx.xml

Batch321.cogx.xml

After parsing the files I want to save them as

Batch123.NO_validations.cogx.xml

Batch321.NO_validations.cogx.xml

Below is the code I have so far. When I run it with two XML files, 'completed' is printed twice in the terminal, but I only get one new file, and it is named '*.NO_validations.cogx.xml' rather than the original file name with the new ending.

import glob
import xml.etree.ElementTree as ET

def removevalidations(filename):
    with open(filename, 'r', encoding="utf-8") as content:   
        elem = ET.parse(filename)
        root = elem.getroot()    
        elementName = "v"
        for elementParent in root.findall(".//{}/..".format(elementName)):
            for element in elementParent.findall("{}".format(elementName)):
                elementParent.remove(element)
        elem.write('*.NO_validations.cogx.xml')
        print('completed')

for filename in glob.glob('*.cogx.xml', recursive = True):
    removevalidations(filename)   

Thanks in advance!!!!

Correct Answer

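You need to construct the new filename explicitly. A glob pattern (*) can only match existing files; it can't be used as a template for creating new ones. As written, elem.write is given the literal name '*.NO_validations.cogx.xml' on every call, so each file overwrites the previous one, which is why you end up with a single output file. Probably the simplest way in this specific case is: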

outfile = filename.split('.')[0] + '.NO_validations.cogx.xml'

Put this near the top of the function, then pass outfile as the target filename to elem.write.
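Putting that together, a minimal sketch of the whole script might look like the following. The helper names output_name, remove_validations and main are made up for illustration and are not from the question. The sketch also applies the split only to the base name, which is an extra precaution in case a directory in the path contains a dot, and it drops the with open(...) wrapper because ET.parse opens the file itself.

```python
import glob
import os
import xml.etree.ElementTree as ET


def output_name(filename):
    # Hypothetical helper: keep the part of the base name before the first
    # dot ("Batch123.cogx.xml" -> "Batch123") and append the new ending.
    directory, base = os.path.split(filename)
    return os.path.join(directory, base.split(".")[0] + ".NO_validations.cogx.xml")


def remove_validations(filename):
    tree = ET.parse(filename)
    root = tree.getroot()
    # Find every element that has a direct <v> child, then remove those children.
    for parent in root.findall(".//v/.."):
        for child in parent.findall("v"):
            parent.remove(child)
    outfile = output_name(filename)
    tree.write(outfile)
    return outfile


def main():
    for filename in glob.glob("*.cogx.xml"):
        print(remove_validations(filename), "completed")


if __name__ == "__main__":
    main()
```

Note that on a second run the pattern '*.cogx.xml' also matches the previously written '*.NO_validations.cogx.xml' outputs. With this naming scheme they map back to the same output name, so they are simply rewritten in place.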

answered 2 days ago