file - autoloader check for duplicates - Stack Overflow


Does Auto Loader in Databricks load the same file again if the file is loaded from another path?

Or does it load the same file again if the file is put in the same directory again after some time?


asked Feb 1 at 15:26 by A_M_2020, edited Feb 3 at 8:53 by Daksh Rawal

1 Answer

Does Auto Loader in Databricks load the same file again

  1. if the file is loaded from another path?

Yes. A different path counts as a new file, so it is loaded again even if its content, filename, timestamp, and other metadata are identical.

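  2. if the file is added to the same directory again after some time?

Yes if the filename has changed, because that is a new path. If the file is put back under the same name, it is not loaded again by default, even when its contents have changed. It is reprocessed only when the `cloudFiles.allowOverwrites` option is set to true.

Basically, Auto Loader keeps track of the files it has already ingested in the stream's checkpoint, keyed by file path, to avoid loading the same file twice. A file it has not yet seen at that path is loaded.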
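To make the two cases concrete, here is a minimal toy model in plain Python. It is not Auto Loader's actual implementation. It assumes, as the Databricks documentation describes, that already ingested files are tracked by path, and that the `cloudFiles.allowOverwrites` option (false by default) decides whether a file re-uploaded to a known path is picked up again. The class `ToyFileTracker`, the method `should_load`, the example paths, and the use of a newer modification time as the trigger are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFileTracker:
    """Toy model of path-keyed ingestion tracking (not Auto Loader's real code)."""

    # Stands in for the cloudFiles.allowOverwrites option (assumed default: False).
    allow_overwrites: bool = False
    # Maps each ingested path to the modification time seen at ingestion.
    seen: dict = field(default_factory=dict)

    def should_load(self, path: str, mtime: int) -> bool:
        """Return True and record the file if this model would ingest it."""
        last = self.seen.get(path)
        if last is None:
            # Never-seen path: ingested, whatever the file contains.
            self.seen[path] = mtime
            return True
        if self.allow_overwrites and mtime > last:
            # Known path, modified since the last ingest, overwrites allowed.
            self.seen[path] = mtime
            return True
        # Known path and overwrites not allowed (or file not modified): skipped.
        return False


tracker = ToyFileTracker()
first = tracker.should_load("/landing/a/data.csv", mtime=100)
other_path = tracker.should_load("/landing/b/data.csv", mtime=100)  # same file, new path
re_upload = tracker.should_load("/landing/a/data.csv", mtime=200)   # same path, later

overwriting = ToyFileTracker(allow_overwrites=True)
overwriting.should_load("/landing/a/data.csv", mtime=100)
re_upload_allowed = overwriting.should_load("/landing/a/data.csv", mtime=200)

print(first, other_path, re_upload, re_upload_allowed)  # True True False True
```

In a real stream the equivalent switch would be `.option("cloudFiles.allowOverwrites", "true")` on the `cloudFiles` reader. Note that reprocessing an overwritten file can produce duplicate rows downstream unless the target table is deduplicated.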
