Universal Naming Convention (UNC)
A Universal Naming Convention (UNC) string uses a common syntax to specify the location of a network resource, such as a shared file, directory, printer, or other device. The UNC syntax for Windows systems has the generic form:
\\ComputerName\SharedFolder\Resource
Schemes
There are three UNC schemes based on namespace selectors: filespace selector, Win32API selector, and device selector. Only the filespace selector is parsed for on-wire traffic; the other two pass opaque BLOBs to the consuming entity.
Filespace Selector
The filespace selector is a null-terminated Unicode character string in the following Augmented Backus-Naur Form (ABNF) syntax:
UNC = "\\" host-name "\" share-name [ "\" object-name ]
host-name = "[" IPv6address "]" / IPv4address / reg-name
   ; IPv6address, IPv4address, and reg-name as specified in [RFC3986]
share-name = 1*80pchar
pchar = %x20-21 / %x23-29 / %x2D-2E / %x30-39 / %x40-5A / %x5E-7B / %x7D-FF
object-name = *path-name [ "\" file-name ]
path-name = 1*255pchar
file-name = 1*255fchar [ ":" stream-name [ ":" stream-type ] ]
fchar = %x20-21 / %x23-29 / %x2B-2E / %x30-39 / %x3B / %x3D / %x40-5B / %x5D-7B / %x7D-FF
stream-name = *schar
schar = %x01-2E / %x30-39 / %x3B-5B / %x5D-FF
stream-type = 1*schar
host-name: The host name of a server or the domain name of a domain hosting resource, using the syntax of IPv6address, IPv4address, and reg-name as specified in [RFC3986]. The host-name string MUST be a NetBIOS name as specified in [MS-NBTE] section 2.2.1, a fully qualified domain name (FQDN) as specified in [RFC1035] and [RFC1123], an IPv4 address as specified in [RFC1123] section 2.1, or an IPv6 address as specified in [RFC4291] section 2.2.
share-name: The name of a share or a resource to be accessed. The format of this name depends on the actual file server protocol that is used to access the share. Examples of file server protocols include SMB (as specified in [MS-SMB]), NFS (as specified in [RFC3530]), and NCP (as specified in [NOVELL]).
object-name: The name of an object; this name depends on the actual resource accessed. The notation "[\object-name]*" indicates that zero or more object names exist in the path, and each object-name is separated from the immediately preceding object-name with a backslash path separator. In a UNC path used to access files and directories in an SMB share, for example, object-name can be the name of a file or a directory.
The host-name, share-name, and object-name are referred to as "pathname components" or "path components". A valid UNC path consists of two or more path components. The host-name is referred to as the "first pathname component", the share-name as the "second pathname component", and so on. The last component of the path is also referred to as the "leaf component".
The protocol that is used to access the resource, and the type of resource that is being accessed, define the size and valid characters for a path component. The only limitations that a Distributed File System (DFS) places on path components are that they MUST be at least one character in length and MUST NOT contain a backslash or null.
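As an illustration of the component terminology above, the following Python sketch splits a UNC path into its pathname components and applies only the minimal checks described for DFS (each component at least one character long and free of null characters). The function name split_unc is hypothetical, and rejecting a trailing backslash is a choice made for this sketch; protocol-specific length and character limits are not enforced.

```python
def split_unc(path: str) -> list[str]:
    """Split a UNC path into pathname components (host, share, objects...)."""
    if not path.startswith("\\\\"):
        raise ValueError("a UNC path must start with two backslashes")
    # Drop the leading "\\" and split on the backslash path separator.
    components = path[2:].split("\\")
    if len(components) < 2:
        raise ValueError("a UNC path needs at least a host-name and a share-name")
    for component in components:
        # Minimal checks only: non-empty and no null character.
        if component == "" or "\x00" in component:
            raise ValueError(f"invalid path component: {component!r}")
    return components


parts = split_unc(r"\\server\share\folder\file.ext")
host_name = parts[0]   # first pathname component
share_name = parts[1]  # second pathname component
leaf = parts[-1]       # leaf component
```

Splitting on the backslash is safe here because a path component cannot itself contain one.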
path-name: One or more pathname components separated by the "\" backslash character. All pathname components other than the last pathname component denote directories or reparse points.
file-name: The "leaf component" of the path, optionally followed by a ":" colon character and a stream-name, optionally followed by a ":" colon character and a stream-type. The stream-name, if specified, MAY be zero-length only if stream-type is also specified; otherwise, it MUST be at least one character. The stream-type, if specified, MUST be at least one character.
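The stream rules for file-name can be sketched in Python as follows. This is a minimal illustration, not a full validator: the names FileName and parse_file_name are hypothetical, the fchar and schar character ranges are not checked, and the "$DATA" stream type in the usage lines is just a familiar NTFS example. It relies on the colon being excluded from both fchar and schar, so it can only appear as a separator.

```python
from typing import NamedTuple, Optional


class FileName(NamedTuple):
    name: str
    stream_name: Optional[str]  # None when no stream-name was given
    stream_type: Optional[str]  # None when no stream-type was given


def parse_file_name(leaf: str) -> FileName:
    """Split a leaf component into file name, stream-name, and stream-type."""
    parts = leaf.split(":")
    if len(parts) > 3:
        raise ValueError("at most two ':' separators are allowed")
    name = parts[0]
    stream_name = parts[1] if len(parts) > 1 else None
    stream_type = parts[2] if len(parts) > 2 else None
    if not name:
        raise ValueError("the file name must be at least one character")
    if stream_type is not None and stream_type == "":
        raise ValueError("stream-type, if specified, must be at least one character")
    if stream_name == "" and stream_type is None:
        raise ValueError("stream-name may be empty only if stream-type is specified")
    return FileName(name, stream_name, stream_type)


plain = parse_file_name("report.txt")
named = parse_file_name("report.txt:notes")
typed = parse_file_name("report.txt::$DATA")  # empty stream-name, explicit type
```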
Extract the Server and Share from a UNC Path using Regular Expressions
Problem
You have a string that holds a (syntactically) valid path to a file or folder on a Windows PC or network. If the path is a UNC path, then you want to extract the name of the network server and the share on the server that the path points to. For example, you want to extract server and share from \\server\share\folder\file.ext.
Solution
^\\\\([a-z0-9_.$?-]+)\\([a-z0-9_.$?-]+)
Regex options: Case insensitive
Discussion
Extracting the network server and share from a string known to hold a valid path is easy, even if you don't know whether the path is a UNC path. The path could be a relative path or use a drive letter.
UNC paths begin with two backslashes. Two consecutive backslashes are not allowed in Windows paths, except to begin a UNC path. Thus, if a known valid path begins with two backslashes, we know that the server and share name must follow.
The anchor <^> matches at the start of the string. The fact that the caret also matches at embedded line breaks in Ruby doesn't matter, because valid Windows paths don't include line breaks.
<\\\\> matches two literal backslashes. Since the backslash is a metacharacter in regular expressions, we have to escape a backslash with another backslash if we want to match it as a literal character. The first character class, <[a-z0-9_.$?-]+>, matches the name of the network server. The second one, after another literal backslash, matches the name of the share. We place each character class between a pair of parentheses, which forms a capturing group. That way you can get the server name alone from the first capturing group, and the share name alone from the second capturing group. The overall regex match will be \\server\share.
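To show the capturing groups in use, here is a small Python sketch that applies the regex from the Solution with the case-insensitive option. The names UNC_PREFIX and server_and_share are hypothetical, chosen for this illustration; the pattern itself is unchanged.

```python
import re

# The regex from the Solution, written as a raw string so that each
# backslash reaches the regex engine as typed.
UNC_PREFIX = re.compile(r"^\\\\([a-z0-9_.$?-]+)\\([a-z0-9_.$?-]+)", re.IGNORECASE)


def server_and_share(path: str):
    """Return (server, share) for a UNC path, or None for any other path."""
    match = UNC_PREFIX.match(path)
    if match is None:
        return None  # relative path or drive-letter path
    return match.group(1), match.group(2)


result = server_and_share(r"\\server\share\folder\file.ext")
not_unc = server_and_share(r"C:\folder\file.ext")
```

For the first path, group 1 holds the server and group 2 the share; the second path does not start with two backslashes, so no match is found.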