Database queries
diamond_miner.queries
Wrappers around ClickHouse SQL queries.
The queries operate on different kind of tables. Refer to the following superclasses for more information: Query, LinksQuery, PrefixesQuery, ProbesQuery, ResultsQuery.
Count
dataclass
Bases: Query
Count the number of rows returned by a given query.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetLinks, GetNodes
>>> Count(query=GetNodes()).execute(client, 'test_nsdi_example')[0]["count()"]
7
>>> Count(query=GetLinks()).execute(client, 'test_nsdi_example')[0]["count()"]
8
Source code in diamond_miner/queries/count.py
query: Query | None = None
class-attribute
instance-attribute
The query for which to count the nodes.
CountLinksPerPrefix
dataclass
Bases: LinksQuery
Count the number of (non-distinct) links per prefix.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import CountLinksPerPrefix
>>> rows = CountLinksPerPrefix().execute(client, 'test_nsdi_example')
>>> sorted((row["prefix"], row["count"]) for row in rows)
[('::ffff:200.0.0.0', 58)]
Source code in diamond_miner/queries/count_rows.py
prefix_len_v4: int = 16
class-attribute
instance-attribute
The IPv4 prefix length to consider.
prefix_len_v6: int = 8
class-attribute
instance-attribute
The IPv6 prefix length to consider.
CountProbesPerPrefix
dataclass
Bases: ProbesQuery
Count the number of probes sent per prefix.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import CountProbesPerPrefix
>>> rows = CountResultsPerPrefix(round_eq=1).execute(client, 'test_nsdi_example')
>>> sorted((row["prefix"], row["count"]) for row in rows)
[('::ffff:200.0.0.0', 24)]
Source code in diamond_miner/queries/count_rows.py
prefix_len_v4: int = 16
class-attribute
instance-attribute
The IPv4 prefix length to consider.
prefix_len_v6: int = 8
class-attribute
instance-attribute
The IPv6 prefix length to consider.
CountResultsPerPrefix
dataclass
Bases: ResultsQuery
Count the number of results per prefix.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import CountResultsPerPrefix
>>> rows = CountResultsPerPrefix(prefix_len_v4=8, prefix_len_v6=8).execute(client, 'test_count_replies')
>>> sorted((row["prefix"], row["count"]) for row in rows)
[('::ffff:1.0.0.0', 2), ('::ffff:2.0.0.0', 1), ('::ffff:204.0.0.0', 1)]
Source code in diamond_miner/queries/count_rows.py
prefix_len_v4: int = 16
class-attribute
instance-attribute
The IPv4 prefix length to consider.
prefix_len_v6: int = 8
class-attribute
instance-attribute
The IPv6 prefix length to consider.
CreateLinksTable
dataclass
Bases: Query
Create the links table containing one line per (flow, link) pair.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import CreateLinksTable
>>> CreateLinksTable().execute(client, "test")
[]
Source code in diamond_miner/queries/create_links_table.py
storage_policy: StoragePolicy = StoragePolicy()
class-attribute
instance-attribute
ClickHouse storage policy to use.
CreatePrefixesTable
dataclass
Bases: Query
Create the table containing (invalid) prefixes.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import CreatePrefixesTable
>>> CreatePrefixesTable().execute(client, "test")
[]
Source code in diamond_miner/queries/create_prefixes_table.py
SORTING_KEY = 'probe_protocol, probe_src_addr, probe_dst_prefix'
class-attribute
instance-attribute
Columns by which the data is ordered.
storage_policy: StoragePolicy = StoragePolicy()
class-attribute
instance-attribute
ClickHouse storage policy to use.
CreateProbesTable
dataclass
Bases: Query
Create the table containing the cumulative number of probes sent over the rounds.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import CreateProbesTable
>>> CreateProbesTable().execute(client, "test")
[]
Source code in diamond_miner/queries/create_probes_table.py
SORTING_KEY = 'probe_protocol, probe_dst_prefix, probe_ttl'
class-attribute
instance-attribute
Columns by which the data is ordered.
storage_policy: StoragePolicy = StoragePolicy()
class-attribute
instance-attribute
ClickHouse storage policy to use.
CreateResultsTable
dataclass
Bases: Query
Create the table used to store the measurement results from the prober.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import CreateResultsTable
>>> CreateResultsTable().execute(client, "test")
[]
Source code in diamond_miner/queries/create_results_table.py
13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 |
|
SORTING_KEY = 'probe_protocol, probe_src_addr, probe_dst_prefix, probe_dst_addr, probe_src_port, probe_dst_port, probe_ttl'
class-attribute
instance-attribute
Columns by which the data is ordered.
prefix_len_v4: int = DEFAULT_PREFIX_LEN_V4
class-attribute
instance-attribute
The prefix length used to compute the IPv4 prefix of an IP address.
prefix_len_v6: int = DEFAULT_PREFIX_LEN_V6
class-attribute
instance-attribute
The prefix length used to compute the IPv6 prefix of an IP address.
storage_policy: StoragePolicy = StoragePolicy()
class-attribute
instance-attribute
ClickHouse storage policy to use.
CreateTables
dataclass
Bases: Query
Create the tables necessary for a measurement.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import CreateTables
>>> CreateTables().execute(client, "test")
[]
Source code in diamond_miner/queries/create_tables.py
prefix_len_v4: int = DEFAULT_PREFIX_LEN_V4
class-attribute
instance-attribute
The prefix length used to compute the IPv4 prefix of an IP address.
prefix_len_v6: int = DEFAULT_PREFIX_LEN_V6
class-attribute
instance-attribute
The prefix length used to compute the IPv6 prefix of an IP address.
storage_policy: StoragePolicy = StoragePolicy()
class-attribute
instance-attribute
ClickHouse storage policy to use.
DropTables
dataclass
Bases: Query
Drop the tables associated to a measurement.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import DropTables
>>> DropTables().execute(client, "test")
[]
Source code in diamond_miner/queries/drop_tables.py
GetInvalidPrefixes
dataclass
Bases: PrefixesQuery
Return the prefixes with unexpected behavior
(see GetPrefixesWithAmplification
and GetPrefixesWithLoops
.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetInvalidPrefixes
>>> rows = GetInvalidPrefixes().execute(client, "test_invalid_prefixes")
>>> [x["probe_dst_prefix"] for x in rows]
['::ffff:201.0.0.0', '::ffff:202.0.0.0']
Source code in diamond_miner/queries/get_invalid_prefixes.py
GetLinks
dataclass
Bases: LinksQuery
Return the links pre-computed in the links table.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetLinks
>>> links = GetLinks(filter_invalid_prefixes=False).execute(client, 'test_invalid_prefixes')
>>> len(links)
3
>>> links = GetLinks(filter_invalid_prefixes=True).execute(client, 'test_invalid_prefixes')
>>> len(links)
1
>>> links = GetLinks(include_metadata=False).execute(client, 'test_nsdi_example')
>>> len(links)
8
>>> links = GetLinks(include_metadata=True).execute(client, 'test_nsdi_example')
>>> len(links)
8
>>> links = GetLinks(near_or_far_addr="150.0.6.1").execute(client, 'test_nsdi_example')
>>> len(links)
3
Source code in diamond_miner/queries/get_links.py
filter_invalid_prefixes: bool = False
class-attribute
instance-attribute
If true, exclude links from prefixes with amplification or loops.
include_metadata: bool = False
class-attribute
instance-attribute
If true, include the TTLs at which near_addr
and far_addr
were seen.
GetLinksFromResults
dataclass
Bases: ResultsQuery
Compute the links from the results table.
This returns one line per (flow, link)
pair.
We do not emit a link in the case of single reply in a traceroute.
For example: * * node * *
, does not generate a link.
However, * * node * * node'
, will generate (node, *)
and (*, node')
.
We emit cross-rounds links.
For example if flow N sees node A at TTL 10 at round 1 and flow N sees node B at TTL 11 at round 2,
we will emit (1, 10, A) - (2, 11, B)
.
We assume that there exists a single (flow, ttl) pair over all rounds (TODO: assert this).
If round_eq
is none, compute the links per flow, across all rounds.
Otherwise, compute the links per flow, for the specified round.
This is useful if you want to update a links
table round-by-round:
such a table will contain only intra-round links but can be updated incrementally.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetLinksFromResults
>>> links = GetLinksFromResults().execute(client, "test_nsdi_example")
>>> len(links)
58
Source code in diamond_miner/queries/get_links_from_results.py
ignore_invalid_prefixes: bool = True
class-attribute
instance-attribute
If true, exclude invalid prefixes from links computation.
GetMDAProbes
dataclass
Bases: LinksQuery
Return the number of probes to send per prefix and per TTL according to the Diamond-Miner algorithm.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetMDAProbes
>>> GetMDAProbes(round_leq=1).execute(client, "test_nsdi_lite")
[{'probe_protocol': 1, 'probe_dst_prefix': '::ffff:200.0.0.0', 'cumulative_probes': [12, 12, 12, 12], 'TTLs': [1, 2, 3, 4]}]
Source code in diamond_miner/queries/get_mda_probes.py
dminer_lite: bool = True
class-attribute
instance-attribute
If true, use an heuristic that requires less probes to handle nested load-balancers.
target_epsilon: float = DEFAULT_FAILURE_RATE
class-attribute
instance-attribute
The desired failure rate of the MDA algorithm, that is, the probability of not detecting all the outgoing edges of a load-balancer for a given prefix and TTL.
GetNodes
dataclass
Bases: ResultsQuery
Return all the discovered nodes.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetNodes
>>> nodes = GetNodes(include_probe_ttl=True).execute(client, 'test_nsdi_example')
>>> sorted((node["probe_ttl"], node["reply_src_addr"]) for node in nodes)
[(1, '::ffff:150.0.1.1'), (2, '::ffff:150.0.2.1'), (2, '::ffff:150.0.3.1'), (3, '::ffff:150.0.4.1'), (3, '::ffff:150.0.5.1'), (3, '::ffff:150.0.7.1'), (4, '::ffff:150.0.6.1')]
>>> nodes = GetNodes(filter_invalid_prefixes=False).execute(client, 'test_invalid_prefixes')
>>> sorted(node["reply_src_addr"] for node in nodes)
['::ffff:150.0.0.1', '::ffff:150.0.0.2', '::ffff:150.0.1.1', '::ffff:150.0.1.2', '::ffff:150.0.2.1', '::ffff:150.0.2.2', '::ffff:150.0.2.3']
>>> nodes = GetNodes(filter_invalid_prefixes=True).execute(client, 'test_invalid_prefixes')
>>> sorted(node["reply_src_addr"] for node in nodes)
['::ffff:150.0.0.1', '::ffff:150.0.0.2', '::ffff:150.0.2.3']
Source code in diamond_miner/queries/get_nodes.py
filter_invalid_prefixes: bool = False
class-attribute
instance-attribute
If true, exclude nodes from prefixes with amplification or loops.
include_probe_ttl: bool = False
class-attribute
instance-attribute
If true, include the TTL at which reply_src_addr
was seen.
GetPrefixes
dataclass
Bases: ResultsQuery
Return the destination prefixes for which replies have been received.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetPrefixes
>>> from ipaddress import ip_network
>>> rows = GetPrefixes().execute(client, 'test_nsdi_example')
>>> len(rows)
1
>>> rows = GetPrefixes().execute(client, 'test_invalid_prefixes')
>>> len(rows)
3
>>> rows = GetPrefixes(reply_src_addr_in=ip_network("150.0.1.0/24")).execute(client, 'test_invalid_prefixes')
>>> len(rows)
1
Source code in diamond_miner/queries/get_prefixes.py
reply_src_addr_in: IPNetwork | None = None
class-attribute
instance-attribute
If specified, keep only the replies from this network.
GetPrefixesWithAmplification
dataclass
Bases: ResultsQuery
Return the prefixes for which we have more than one reply per (flow ID, TTL).
Important
This query assumes that a single probe is sent per (flow ID, TTL) pair.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetPrefixesWithAmplification
>>> rows = GetPrefixesWithAmplification().execute(client, "test_invalid_prefixes")
>>> [x["probe_dst_prefix"] for x in rows]
['::ffff:202.0.0.0']
Source code in diamond_miner/queries/get_invalid_prefixes.py
GetPrefixesWithLoops
dataclass
Bases: ResultsQuery
Return the prefixes for which an IP address appears multiple time for a single flow ID.
Important
Prefixes with amplification (multiple replies per probe) may trigger a false positive for this query, since we do not check that the IP appears at two different TTLs.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetPrefixesWithLoops
>>> GetPrefixesWithLoops().execute(client, "test_invalid_prefixes")
[{'probe_protocol': 1, 'probe_src_addr': '::ffff:100.0.0.1', 'probe_dst_prefix': '::ffff:201.0.0.0', 'has_loops': 1}]
Source code in diamond_miner/queries/get_invalid_prefixes.py
GetProbes
Bases: ProbesQuery
Return the cumulative number of probes sent at a specified round.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetProbes
>>> row = GetProbes(round_eq=1).execute(client, 'test_nsdi_example')[0]
>>> row["probe_protocol"]
1
>>> row["probe_dst_prefix"]
'::ffff:200.0.0.0'
>>> sorted(row["probes_per_ttl"])
[[1, 6], [2, 6], [3, 6], [4, 6]]
>>> row = GetProbes(round_eq=2).execute(client, 'test_nsdi_example')[0]
>>> sorted(row["probes_per_ttl"])
[[1, 11], [2, 18], [3, 18], [4, 18]]
>>> row = GetProbes(round_eq=3).execute(client, 'test_nsdi_example')[0]
>>> sorted(row["probes_per_ttl"])
[[1, 11], [2, 20], [3, 27], [4, 27]]
>>> row = GetProbes(round_eq=3, probe_ttl_geq=2, probe_ttl_leq=3).execute(client, 'test_nsdi_example')[0]
>>> sorted(row["probes_per_ttl"])
[[2, 20], [3, 27]]
>>> GetProbes(round_eq=4).execute(client, 'test_nsdi_example')
[]
Source code in diamond_miner/queries/get_probes.py
GetProbesDiff
Bases: ProbesQuery
Return the number of probes sent at a specific round and at the previous round.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetProbesDiff
>>> row = GetProbesDiff(round_eq=1).execute(client, 'test_nsdi_example')[0]
>>> row["probe_protocol"]
1
>>> row["probe_dst_prefix"]
'::ffff:200.0.0.0'
>>> sorted(row["probes_per_ttl"])
[[1, 6, 0], [2, 6, 0], [3, 6, 0], [4, 6, 0]]
>>> row = GetProbesDiff(round_eq=2).execute(client, 'test_nsdi_example')[0]
>>> sorted(row["probes_per_ttl"])
[[1, 11, 6], [2, 18, 6], [3, 18, 6], [4, 18, 6]]
>>> row = GetProbesDiff(round_eq=3).execute(client, 'test_nsdi_example')[0]
>>> sorted(row["probes_per_ttl"])
[[1, 11, 11], [2, 20, 18], [3, 27, 18], [4, 27, 18]]
>>> GetProbesDiff(round_eq=4).execute(client, 'test_nsdi_example')
[]
Source code in diamond_miner/queries/get_probes.py
GetResults
dataclass
Bases: ResultsQuery
Return all the columns from the results table.
Examples:
>>> from diamond_miner.test import client
>>> from diamond_miner.queries import GetResults
>>> rows = GetResults().execute(client, 'test_nsdi_example')
>>> len(rows)
85
Source code in diamond_miner/queries/get_results.py
GetSlidingPrefixes
dataclass
Bases: ResultsQuery
Get the prefixes to probe for a given sliding window.
Source code in diamond_miner/queries/get_sliding_prefixes.py
stopping_condition: int = 0
class-attribute
instance-attribute
Number of stars to allow.
window_max_ttl: int = 0
class-attribute
instance-attribute
Set to 0 to return every prefix.
InsertLinks
dataclass
Bases: ResultsQuery
Insert the results of the GetLinksFromResults
query into the links table.
Source code in diamond_miner/queries/insert_links.py
InsertMDAProbes
dataclass
Bases: GetMDAProbes
Insert the result of the GetMDAProbes
queries
into the probes table.
Source code in diamond_miner/queries/insert_mda_probes.py
InsertPrefixes
dataclass
Bases: ResultsQuery
Insert the results of the GetPrefixesWithAmplification
and GetPrefixesWithLoops
queries into the prefixes table.
Source code in diamond_miner/queries/insert_prefixes.py
InsertResults
dataclass
Bases: Query
Insert measurement results from a CSV file.
Source code in diamond_miner/queries/insert_results.py
LinksQuery
dataclass
Bases: Query
Base class for queries on the links table.
Source code in diamond_miner/queries/query.py
filter_inter_round: bool = False
class-attribute
instance-attribute
If true, exclude links inferred across rounds.
filter_partial: bool = False
class-attribute
instance-attribute
If true, exclude partial links: ('::', node)
and (node, '::')
.
filter_virtual: bool = False
class-attribute
instance-attribute
If true, exclude virtual links: ('::', '::')
.
near_or_far_addr: str | None = None
class-attribute
instance-attribute
If specified, keep only the links that contains this IP address.
probe_protocol: int | None = None
class-attribute
instance-attribute
If specified, keep only the links inferred from probes sent with this protocol.
probe_src_addr: str | None = None
class-attribute
instance-attribute
If specified, keep only the links inferred from probes sent by this address. This filter is relatively costly (IPv6 comparison on each row).
round_eq: int | None = None
class-attribute
instance-attribute
If specified, keep only the links from this round.
round_leq: int | None = None
class-attribute
instance-attribute
If specified, keep only the links from this round or before.
filters(subset)
WHERE
clause common to all queries on the links table.
Source code in diamond_miner/queries/query.py
PrefixesQuery
dataclass
Bases: Query
Base class for queries on the prefixes table.
Source code in diamond_miner/queries/query.py
probe_protocol: int | None = None
class-attribute
instance-attribute
If specified, keep only the links inferred from probes sent with this protocol.
probe_src_addr: str | None = None
class-attribute
instance-attribute
If specified, keep only the links inferred from probes sent by this address. This filter is relatively costly (IPv6 comparison on each row).
filters(subset)
WHERE
clause common to all queries on the prefixes table.
Source code in diamond_miner/queries/query.py
ProbesQuery
dataclass
Bases: Query
Base class for queries on the probes table.
Source code in diamond_miner/queries/query.py
probe_protocol: int | None = None
class-attribute
instance-attribute
If specified, keep only probes sent with this protocol.
probe_ttl_geq: int | None = None
class-attribute
instance-attribute
If specified, keep only the probes with TTL >= this value.
probe_ttl_leq: int | None = None
class-attribute
instance-attribute
If specified, keep only the probes with TTL <= this value.
round_eq: int | None = None
class-attribute
instance-attribute
If specified, keep only the probes from this round.
round_geq: int | None = None
class-attribute
instance-attribute
If specified, keep only the probes from this round or after.
round_leq: int | None = None
class-attribute
instance-attribute
If specified, keep only the probes from this round or before.
round_lt: int | None = None
class-attribute
instance-attribute
If specified, keep only the probes from before this round.
filters(subset)
WHERE
clause common to all queries on the probes table.
Source code in diamond_miner/queries/query.py
Query
dataclass
Base class for every query.
Source code in diamond_miner/queries/query.py
62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 |
|
execute(client, measurement_id, *, data=None, limit=None, subsets=(UNIVERSE_SUBSET))
Execute the query and return each row as a dict. Args: client: ClickHouse client. measurement_id: Measurement id. data: str or bytes iterator containing data to send. limit: (limit, offset) tuple. subsets: Iterable of IP networks on which to execute the query independently.
Source code in diamond_miner/queries/query.py
execute_concurrent(client, measurement_id, *, subsets=(UNIVERSE_SUBSET), limit=None, concurrent_requests=max(available_cpus() // 8, 1))
Execute the query concurrently on the specified subsets.
Source code in diamond_miner/queries/query.py
execute_iter(client, measurement_id, *, data=None, limit=None, subsets=(UNIVERSE_SUBSET))
Execute the query and return each row as a dict, as they are received from the database.
Source code in diamond_miner/queries/query.py
ResultsQuery
dataclass
Bases: Query
Base class for queries on the results table.
Source code in diamond_miner/queries/query.py
filter_destination_host: bool = True
class-attribute
instance-attribute
If true, ignore the replies from the destination host.
filter_destination_prefix: bool = True
class-attribute
instance-attribute
If true, ignore the replies from the destination prefix.
filter_invalid_probe_protocol: bool = True
class-attribute
instance-attribute
If true, ignore the replies with probe protocol ≠ ICMP, ICMPv6 or UDP.
filter_private: bool = True
class-attribute
instance-attribute
If true, ignore the replies from private IP addresses.
probe_protocol: int | None = None
class-attribute
instance-attribute
If specified, keep only the replies to probes sent with this protocol.
probe_src_addr: str | None = None
class-attribute
instance-attribute
If specified, keep only the replies to probes sent by this address. This filter is relatively costly (IPv6 comparison on each row).
round_eq: int | None = None
class-attribute
instance-attribute
If specified, keep only the replies from this round.
round_leq: int | None = None
class-attribute
instance-attribute
If specified, keep only the replies from this round or before.
time_exceeded_only: bool = True
class-attribute
instance-attribute
If true, ignore non ICMP time exceeded replies.
filters(subset)
WHERE
clause common to all queries on the results table.
Source code in diamond_miner/queries/query.py
StoragePolicy
dataclass
Source code in diamond_miner/queries/query.py
archive_on: datetime = datetime(2100, 1, 1)
class-attribute
instance-attribute
Date at which the table will be moved to the archive volume.
archive_to: str = 'default'
class-attribute
instance-attribute
Name of the ClickHouse archive volume.
name: str = 'default'
class-attribute
instance-attribute
Name of the ClickHouse storage policy to use for the table.