capture file name as variable based on substring in python - Stack Overflow

admin2025-05-01  0

I have the following scenario in Python: I want to inspect the current working directory, list the files in it, and create a variable from one of the file names.

This is what I have so far:

import os

# current working directory
print(os.getcwd())

# files present in that directory
dir_contents = os.listdir('.')

# print the directory contents
print(dir_contents)
# output:
# ['.test.json.crc', '.wf_list_dir_param.py.crc', 'test.json', 'wf_list_dir_param.py']

From the dir_contents list, I want to extract wf_list_dir into a variable.

To do that, I need to:

  1. find the elements that start with wf and end with param.py
  2. extract everything before _param.py into a variable

How do I do that?

Asked Jan 2 at 17:12 by nmr; edited Jan 2 at 17:43 by Barmar.

  • Are you familiar with regular expressions or pathlib.Path.glob? Also, bear in mind that a filename that ends with "param.py" won't necessarily have anything preceding "_param.py". – Adon Bilivit, commented Jan 2 at 17:27

2 Answers

You can achieve this with pathlib.Path.glob as follows:

from pathlib import Path

prefix = "wf"
suffix = "_param.py"
source_dir = Path(".")

for file in source_dir.glob(f"{prefix}*{suffix}"):
    print(file.name[:-len(suffix)])

Thus, if your current working directory contains a file named wf_list_dir_param.py, this will print wf_list_dir.
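Since the question asks for a variable rather than printed output, the same glob can feed a list instead. The following is only a sketch of that idea: the function name workflow_names and its default arguments are hypothetical, chosen for illustration, and it keeps the slicing approach used above.

```python
from pathlib import Path


def workflow_names(source_dir=".", prefix="wf", suffix="_param.py"):
    """Return the matching file names with the suffix stripped, sorted.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    names = []
    for path in Path(source_dir).glob(f"{prefix}*{suffix}"):
        # Skip directories that happen to match the pattern.
        if path.is_file():
            names.append(path.name[:-len(suffix)])
    return sorted(names)


# Collect the names for the current working directory.
names = workflow_names(".")
# Take the first match, or None when nothing matched.
wf_name = names[0] if names else None
print(wf_name)
```

Note that the * in the glob can match nothing at all, so a file named wf_param.py would also match and yield just wf. Whether that is acceptable depends on how the files are named.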

I think the easiest way forward is to use the re module, but you could also start with just str.startswith() and str.endswith().
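For reference, the str.startswith() / str.endswith() route could look roughly like this. It is a minimal sketch that reuses the file list from the question; the variable names prefix, suffix, matches and wf_name are assumptions made for illustration.

```python
filenames = [".test.json.crc", ".wf_list_dir_param.py.crc", "test.json", "wf_list_dir_param.py"]

prefix = "wf"
suffix = "_param.py"

# Keep names that start with the prefix and end with the suffix,
# then slice the suffix off the end of each one.
matches = [
    name[:-len(suffix)]
    for name in filenames
    if name.startswith(prefix) and name.endswith(suffix)
]

# Take the first match as the variable, or None when nothing matched.
wf_name = matches[0] if matches else None
print(wf_name)  # wf_list_dir
```

The hidden .crc files are skipped because they start with a dot rather than wf.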

Given your list of files, you can use a regex pattern like r"^wf_(.+)_param\.py$" to both identify matches and pick out the part you want.

For example:

import re

filenames = [".test.json.crc", ".wf_list_dir_param.py.crc", "test.json", "wf_list_dir_param.py"]
pattern = re.compile(r"^wf_(.+)_param\.py$")
for text in filenames:
    match = pattern.search(text)
    if match:
        print(match.group(1))   

This should give you:

list_dir

Here is a breakdown of the pattern, per regex101.com:

^      asserts position at start of a line
wf_    matches the characters wf_ literally (case sensitive)
1st Capturing Group (.+)
       . matches any character (except for line terminators)
       + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
_param matches the characters _param literally (case sensitive)
\.     matches the character . with index 46 decimal (2E hex or 56 octal) literally (case sensitive)
py     matches the characters py literally (case sensitive)
$      asserts position at the end of a line

If you actually want the "wf_" as part of the result, you can move it into the capturing group.

pattern = re.compile(r"^(wf_.+)_param\.py$")

With this updated pattern, the original code should give you:

wf_list_dir
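To end up with a variable instead of printed output, the loop can be wrapped in a small function. This is a sketch under a few assumptions: the names PATTERN and first_workflow_name are hypothetical, and it uses re.Pattern.fullmatch in place of the ^ and $ anchors, which requires the whole file name to match.

```python
import re

# Same capture as above; fullmatch() makes the ^ and $ anchors unnecessary.
PATTERN = re.compile(r"(wf_.+)_param\.py")


def first_workflow_name(filenames):
    """Return the captured part of the first matching name, or None."""
    for name in filenames:
        match = PATTERN.fullmatch(name)
        if match:
            return match.group(1)
    return None


filenames = [".test.json.crc", ".wf_list_dir_param.py.crc", "test.json", "wf_list_dir_param.py"]
wf_name = first_workflow_name(filenames)
print(wf_name)  # wf_list_dir
```

Because .+ needs at least one character, a file named wf_param.py does not match this pattern, which relates to the caveat raised in the comment on the question.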