### Graph framework validation

The graph formulation framework was modified to account for vegetation and relevant building features. To confirm the accuracy of these additions to the graph model^{6}, we conducted validation tests on Paradise, California (US), which was devastated during the historic 2018 Camp Fire. The Camp Fire started in Butte County in Northern California due to a faulty electric transmission line^{38}. Within hours, it reached the community of Paradise and caused significant devastation. In addition to unfavorable wind and climate conditions, the high density of urban and wildland vegetation in and around residential homes in Paradise fueled the intensity of the wildfire^{37,38}.

We conducted a sensitivity analysis to test the efficacy of the graph framework by quantifying the impact of wildland vegetation, and of buildings together with the urban vegetation in their proximity, on community vulnerability. We introduce two parameters, (1) $v_w \in [0,90]$ and (2) $v_u \in [0,90]$, such that the former represents the percentage reduction in wildland vegetation and the latter represents the percentage reduction in building nodes along with urban vegetation. We model the reduction in wildland vegetation by removing vegetation nodes and the reduction in building nodes and urban vegetation by removing building nodes (see section “Modeling vegetation”). For each combination of wildland and building densities, the wildfire graph is first formulated by evaluating the ignition probabilities $P_{tr}^{(i,j)}$ between all node pairs (*i*, *j*). We evaluate each node’s vulnerability by identifying the Most Probable Paths (MPPs), that is, the paths with the highest probability of fire propagation from an ignited node to a non-ignited one (see “Materials and methods”). We calculate the mean vulnerability of all building nodes to represent the vulnerability of the entire community. Based on the values selected for the density factors, $v_w$ percent of the vegetation nodes and $v_u$ percent of the building nodes are selected randomly for removal. To reduce selection bias, we repeat the selection process in a Monte Carlo simulation with $K = 100$ iterations. The reported mean vulnerability is the average, over all iterations, of the mean vulnerability of all nodes considered within the testbed. Heatmaps are generated for three wind speeds (10 m/s, 15 m/s, and 20 m/s), indicating the variation in mean vulnerability for different vegetation and building densities. As expected, the vulnerability is maximum for $v_w = 0$ and $v_u = 0$ and minimum for $v_w = 90$ and $v_u = 90$. The pattern of variation observed in the heatmaps, as shown in Fig. 2, is in accordance with expectations.
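The random-removal loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the `community_vulnerability` callable are hypothetical, and the callable stands in for the full graph formulation and MPP-based evaluation, which are not reproduced here.

```python
import random
from statistics import mean


def monte_carlo_mean_vulnerability(veg_nodes, bld_nodes, v_w, v_u,
                                   community_vulnerability, K=100, seed=0):
    """Average community vulnerability over K random node-removal trials.

    v_w, v_u: percentage (0-90) of vegetation / building nodes to remove.
    community_vulnerability: hypothetical callable(kept_veg, kept_bld) -> float
    standing in for graph formulation plus MPP-based evaluation.
    """
    rng = random.Random(seed)
    n_veg_removed = round(len(veg_nodes) * v_w / 100)
    n_bld_removed = round(len(bld_nodes) * v_u / 100)
    results = []
    for _ in range(K):
        # Draw a fresh random subset of nodes to remove in every iteration.
        removed_veg = set(rng.sample(veg_nodes, n_veg_removed))
        removed_bld = set(rng.sample(bld_nodes, n_bld_removed))
        kept_veg = [n for n in veg_nodes if n not in removed_veg]
        kept_bld = [n for n in bld_nodes if n not in removed_bld]
        results.append(community_vulnerability(kept_veg, kept_bld))
    return mean(results)


# Toy stand-in (an assumption for illustration only): vulnerability grows
# with the number of nodes that remain in the graph.
def toy_vulnerability(kept_veg, kept_bld):
    return (len(kept_veg) + len(kept_bld)) / 200


veg = [f"v{i}" for i in range(100)]
bld = [f"b{i}" for i in range(100)]
baseline = monte_carlo_mean_vulnerability(veg, bld, 0, 0, toy_vulnerability)
reduced = monte_carlo_mean_vulnerability(veg, bld, 90, 90, toy_vulnerability)
print(baseline, reduced)
```

Evaluating this function over a grid of $(v_w, v_u)$ values would give one heatmap per wind speed, provided the stand-in callable is replaced by an actual vulnerability evaluation.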

We also conducted a vulnerability analysis on Paradise to compare the vulnerability patterns calculated under the historic Camp Fire conditions with the observed damage. A graph for the community of Paradise is created from pre-fire building and vegetation fuel GIS data. A wind speed of 15 m/s in a north-east to south-west direction is assumed, similar to that observed during the Camp Fire. Figure S1 in the SI compares the observed damage from the Camp Fire, as outlined in a post-fire study conducted by NIST^{39}, with the calculated vulnerability. Nodes with high vulnerability values are expected to ignite earlier than nodes with lower values. The pattern of ignitions observed during the fire coincides with the calculated high-vulnerability nodes, suggesting that the graph model framework can capture wildfire interaction with the built environment.

### Node influence metric

To determine the survival likelihood of individual structures within wildfire-affected regions, we borrow concepts from graph theory to assess the relative vulnerability $V_r^{(i)}$ of structures in a wildfire event. The concept of vulnerability can be anchored in the notion of nodal importance, particularly centrality measures^{40} and node influence^{41,42}. In the context of graphs, centrality measures are indicators of the importance or influence of nodes within a network based on specific criteria. Decades of research have led to significant strides in identifying influential nodes within networks, and the concept has found widespread application in different fields. Prominent applications of centrality include the identification of the most influential persons in a social network^{22} and of super-spreaders of disease^{41}. There are different types of centrality measures in the literature, each effective for specific applications.

In this study, we first evaluate the ability of traditional centrality measures to assess the survival likelihood of buildings. We tested the following widely accepted centrality measures to determine the vulnerability of individual nodes in a graph network: (1) Closeness (Fig. S2 of SI text), (2) Eigenvector (Fig. S3 of SI text), (3) Clustering coefficient (Fig. S4 of SI text), (4) Gravity (Fig. S5 of SI text), (5) Degree (Fig. S6 of SI text), and (6) Betweenness centrality^{6}. Each measure was tested on two major wildfires in the US: (1) the 2018 Camp Fire and (2) the 2020 Glass Fire. Both are considered among the most destructive fires in the history of California. In the case of the Camp Fire, high-density urban vegetation around houses drove the spread of the wildfire, whereas in the case of the Glass Fire, wildland vegetation was the governing factor. The graph formulated for the Camp Fire testbed comprises 11,945 building nodes and 4685 vegetation nodes, while the graph for the Glass Fire comprises 3596 building nodes and 15,834 vegetation nodes. Based on the DINS database, 10,923 buildings were damaged in the Camp Fire testbed and 1027 buildings in the Glass Fire testbed. In both cases, similar wind conditions are considered for analysis: wind speed $v_w = 15$ m/s and wind direction $\theta_w = 225^\circ$, measured counter-clockwise from the x-axis. The vulnerability value calculated for individual nodes is converted to a damage state (see “Damage comparison” section in “Materials and methods”). The calculated damage states of individual nodes are compared to the observed damage states by measuring the prediction accuracy $P_{a}$ (see “Damage comparison” section in “Materials and methods”), which is calculated based on the number of damaged and undamaged nodes. The results show that most centrality measures exhibit low maximum prediction accuracy ($\approx 50\%$), while degree centrality shows a slightly higher maximum prediction accuracy ($\approx 55\%$). This is because most centrality measures are not informative for the vast majority of network nodes^{42}; instead, they tend to focus on a small number of highly influential nodes, resulting in an underestimation of the spreading power of nodes^{43} (see section 4 in the SI).

A more intuitive approach would be to measure the spreading capacity of nodes to assess the impact of buildings on fire ignitions. Node influence metrics are distinct from centrality measures and explicitly determine highly influential nodes in any network during a spreading process^{44,45}. Some approaches to quantify the spreading power of nodes have been proposed in the last decade. One such measure is accessibility^{46,47}, which utilizes the concept of random walks to measure how accessible the rest of the network is from a given initiation node. A random walk on a graph can be defined as a random process of sequential selection of nodes and edges to traverse from a particular node on a graph. Another measure is the expected force, developed using concepts of information entropy and random walks, which can assess the strength of spreading power generated by a node^{42}. The concept of random walks can be considered relevant in determining the capacity of a node to transmit to other nodes.

### Relative vulnerability metrics

The results show that all centralities that consider the impact of distant nodes, such as eigenvector, closeness, gravity, and betweenness, appear to be ineffective for the research problem in question, as reflected by their low prediction accuracy. In contrast, degree centrality, which takes into account the local (or short-range) impact, performs relatively better (higher prediction accuracy). Degree centrality can be divided into indegree and outdegree: the former refers to the cumulative impact of edges directed towards a node, and the latter refers to the effect of a node on others. In the context of fires, the indegree centrality can be considered a measure of the likelihood of ignition of a node when all its neighbors are ignited, while the outdegree can be regarded as a measure of the capacity of a node to spread fire to its neighboring nodes. It can be hypothesized that the chance of structural ignition is strongly correlated with the ignition of neighboring structures^{35}; hence the definition of indegree centrality is well suited to our intended application. In addition, the concept of random walks is related to the spreading power of a node, as it provides the theoretical framework to capture the randomness in the spread of wildfires from one node to its neighboring nodes. Two formulations are proposed in this study to evaluate the relative vulnerability of individual nodes: (1) a modified degree formulation ($V_{md}$) and (2) a modified random walk formulation ($V_{mrw}$).

#### Modified degree formulation

The first formulation, modified degree $V_{md}$, is based on the concept of indegree. In this formulation, the relative vulnerability of an individual node is assessed as the mean of incoming edge weights from all neighboring nodes $N^{(i)}$, given by Eq. (6), where $n^{(i)}$ is the number of neighboring nodes. The neighbors of each node *i* are defined as the set of nodes $N^{(i)}$ whose probability of igniting the target node is greater than zero. We introduce an additional constraint to improve the accuracy of the modified degree formulation. Liu et al.^{48} demonstrated that the spreading ability of each node can be better ascertained by removing low-impact links. In the context of wildfires, we hypothesize that in most cases, low-impact neighbors do not contribute to structural ignition. Low-impact neighbors are defined as the neighbors whose ignition probability $P_{tr}^{(N^{(i)},i)}$ towards the target node *i* does not exceed a certain threshold probability $P_{th}$, as shown in Eq. (7). Accordingly, we remove all low-impact (low-probability) connections between node pairs from the graph $\mathscr{G}$, such that $\mathscr{E}^o = \mathscr{E} - \epsilon$, to obtain the modified graph $\mathscr{G}^o$, where $\epsilon$ is the set containing all edges with weights that do not exceed the threshold value. The framework of the modified degree formulation $V_{md}$ is also described in Fig. 3.

$$\begin{aligned} V_{md}^{(i)} &= \frac{\sum_{k \in N^{(i)}} P_{tr}^{(k,i)}}{n^{(i)}} \end{aligned} \tag{6}$$

$$\begin{aligned} P_{tr}^{(N^{(i)},i)} &= \Big\{ 0 \;\Big|\; P_{tr}^{(N^{(i)},i)} \le P_{th} \Big\} \end{aligned} \tag{7}$$
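As an illustration of Eqs. (6) and (7), the following minimal sketch computes the modified degree on a toy graph. The function name, the edge-dictionary layout, and the toy probabilities are assumptions made for this example. It also assumes that $n^{(i)}$ counts only the neighbors that remain after thresholding, which follows from defining neighbors as nodes with non-zero ignition probability.

```python
def modified_degree(p_tr, p_th):
    """Relative vulnerability V_md per node, following Eqs. (6)-(7).

    p_tr: dict mapping a directed edge (k, i) to the probability that
          node k ignites node i.
    p_th: threshold; edges with probability <= p_th are dropped (Eq. 7).
    Returns a dict node -> mean incoming probability over kept neighbors.
    """
    incoming = {}
    nodes = set()
    for (k, i), p in p_tr.items():
        nodes.update((k, i))
        if p > p_th:  # keep only edges above the threshold
            incoming.setdefault(i, []).append(p)
    # Assumption: a node with no remaining neighbors gets vulnerability 0.
    return {
        i: (sum(incoming[i]) / len(incoming[i]) if i in incoming else 0.0)
        for i in nodes
    }


# Toy graph: node "c" has two strong neighbors and one weak neighbor.
edges = {("a", "c"): 0.8, ("b", "c"): 0.4, ("d", "c"): 0.05, ("c", "a"): 0.2}
v_all = modified_degree(edges, p_th=0.0)     # no links removed
v_pruned = modified_degree(edges, p_th=0.1)  # weak link (d, c) removed
print(v_all["c"], v_pruned["c"])
```

In this toy case, pruning the weak link raises the value for node `c` because the mean is no longer diluted by a neighbor that is unlikely to ignite it.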

We test the modified degree formulation on both testbeds and measure its effectiveness by developing a survivability plot that expresses the survival likelihood of individual buildings across different vulnerability classes, as discussed in “Materials and methods”. The respective vulnerability map, distribution, and survival plots are shown in Fig. S7 of the SI. The plot represents the survival probability of buildings in each class of the calculated relative vulnerability. The survival probability for the lower vulnerability classes is expected to be higher than for the higher vulnerability classes. Thus, a strictly decreasing curve would suggest a positive correlation between the calculated relative vulnerability values and the observed damage states. For the Camp Fire, most buildings were within high-density vegetation; as a result, a higher number of structures were destroyed. In the case of the Glass Fire, most buildings had sparse vegetation in their surroundings, resulting in relatively lower losses. The impact of removing low-impact connections from the formulated graph is also tested. The survival curves for the two testbeds are shown in the SI in Fig. S9a and c for the case without removal and Fig. S9b and d for the case with removal. In the latter case (after removal of low-probability links), the survival curves are strictly monotonically decreasing, suggesting a better separation of vulnerability classes. Thus, by removing low-impact connections within the graph, the survival likelihood of individual nodes can be better ascertained. In addition, the prediction accuracy calculated from the vulnerability values (Fig. S8 in the SI) shows a maximum accuracy of 57.9% for the Camp Fire and 60.4% for the Glass Fire, which is higher than that of the other node influence metrics tested (Section 4 in the SI).

#### Modified random walk formulation

In the modified random walk formulation $V_{mrw}$, as the initial step, we evaluate the transmissibility $t^{(i)}$ of each node using Eq. (8). A set of $R$ random walks $\{r_{(1)}^{(i)}, \ldots, r_{(R)}^{(i)}\}$ is generated for any node *i*, such that the $r$th walk for a node is defined as $r_{(r)}^{(i)} = \{ i \xrightarrow{e^{(1)}} v^{(1)} \xrightarrow{e^{(h)}} v^{(h)} \ldots v^{(\lambda)} \}$, where $v^{(h)}$ and $e^{(h)}$ are the node and edge indices at step $h$ and $\lambda$ is the maximum step size (number of steps) considered for the random walk. At each step of the walk $r_{(r)}^{(i)}$, the subsequent node index $v^{(h+1)}$ is determined by selecting one of the neighbors $N^{(v^{(h)})}$ of node $v^{(h)}$ at random.

$$\begin{aligned} t^{(i)} = \frac{\sum_{r = 1}^{R} \bigg[ \prod_{h \in r_{(r)}^{(i)}} P_{tr}^{(e^{(h)})} \bigg]}{R} \end{aligned} \tag{8}$$
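Eq. (8) can be read as a Monte Carlo average over random walks, which the following minimal sketch illustrates. The function name, the adjacency-list layout, and the toy probabilities are assumptions for this example. The handling of a walk that reaches a node with no outgoing edges (it stops early) and of a node with no outgoing edges at all (it contributes zero) is also an assumption, since the text does not specify these cases.

```python
import random


def transmissibility(out_edges, node, n_walks, max_steps, rng):
    """Monte Carlo estimate of t^(i) in Eq. (8).

    out_edges: dict node -> list of (neighbor, ignition probability).
    Each walk picks the next node uniformly at random among the current
    node's neighbors and multiplies the edge probabilities it traverses.
    """
    total = 0.0
    for _ in range(n_walks):
        current, product, steps = node, 1.0, 0
        # Assumption: a walk stops early at a node with no outgoing edges.
        while steps < max_steps and out_edges.get(current):
            current, p = rng.choice(out_edges[current])
            product *= p
            steps += 1
        # Assumption: a walk that cannot leave its start node contributes 0.
        total += product if steps > 0 else 0.0
    return total / n_walks


# Toy chain a -> b -> c with ignition probabilities 0.5 and 0.4.
chain = {"a": [("b", 0.5)], "b": [("c", 0.4)], "c": []}
rng = random.Random(1)
t_one_step = transmissibility(chain, "a", 50, 1, rng)   # single edge a -> b
t_two_steps = transmissibility(chain, "a", 50, 2, rng)  # product along a -> b -> c
print(t_one_step, t_two_steps)
```

On a chain every walk is identical, so the estimate equals the product of the edge probabilities. On a branching graph the estimate approaches the mean of the path products as the number of walks grows.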

It can be inferred from observations in post-fire studies that if a building is near fuels with high transmission capacity, the risk of ignition for the building can also be expected to be high^{2,30}. In other words, the vulnerability of a node can be considered proportional to the transmissibility $t^{(i)}$ of its neighbors. We define the relative vulnerability as the mean transmissibility of all neighboring nodes, as given by Eq. (9). Similar to the assumption made in the previous formulation, we eliminate the transmissibility values below the threshold value $P_{th}$ to obtain the neighboring node set $N^{(i)}$. The steps involved in the random walk formulation are demonstrated in Fig. 3.

$$\begin{aligned} V_{mrw}^{(i)} = \frac{\sum_{k \in N^{(i)}} t^{(k)}}{n^{(i)}} \end{aligned} \tag{9}$$
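A minimal sketch of Eq. (9) is given below, taking the transmissibility values as already computed. The function name, the dictionary layout, and the toy values are assumptions for this example. Neighbors whose transmissibility is at or below $P_{th}$ are dropped, mirroring the inequality in Eq. (7); treating a node with no remaining neighbors as having zero vulnerability is a further assumption.

```python
def modified_random_walk(neighbors, t, p_th):
    """Relative vulnerability V_mrw per node, following Eq. (9).

    neighbors: dict node -> list of neighboring nodes that can ignite it.
    t: dict node -> transmissibility value t^(k).
    p_th: threshold; neighbors with t^(k) <= p_th are excluded.
    """
    v_mrw = {}
    for i, nbrs in neighbors.items():
        kept = [t[k] for k in nbrs if t[k] > p_th]
        # Assumption: no remaining neighbors means zero vulnerability.
        v_mrw[i] = sum(kept) / len(kept) if kept else 0.0
    return v_mrw


# Toy values: "c" has a transmissibility below the threshold.
t_values = {"a": 0.6, "b": 0.3, "c": 0.02}
nbrs = {"d": ["a", "b", "c"], "e": ["c"]}
v_mrw = modified_random_walk(nbrs, t_values, p_th=0.1)
print(v_mrw)
```

Here node `d` is scored by its two high-transmissibility neighbors only, while node `e`, whose single neighbor falls below the threshold, receives a vulnerability of zero.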

We test the modified random walk formulation, and the corresponding results for the two testbeds are shown in Fig. S10 of the SI. The prediction results show a maximum accuracy of 57.5% for the Camp Fire and 63.8% for the Glass Fire (Fig. S11 in the SI), which is higher than that of the centrality measures tested (Section 4 in the SI). The survival plots show that the random walk formulation works better for the Glass Fire than for the Camp Fire. The random walk formulation overestimates the vulnerability compared with the modified degree formulation. A general observation for the two testbeds is that most destroyed structures have more neighbors with a high probability of ignition than structures that survived. As a result, higher-probability neighbors are selected more often in the random walks generated. In this formulation, weak edges (low probability of ignition) are given the same weight as other edges. However, in the case of the modified degree formulation, edges that are higher in number are given more weight. From the results, we observe that the modified random walk formulation can better identify nodes that lie at the extremes of the vulnerability scale. In contrast, the modified degree formulation works better for nodes with mid-range vulnerability values.

For the modified random walk formulation, the selected step size $\lambda$ has a noticeable impact on the accuracy. To determine the optimal step size, we tested step sizes ranging from $\lambda = 1$ to $\lambda = 4$ for the two testbeds. Survival curves for different step sizes are shown in Figs. S12 and S13 in the SI. The optimal case is found to be $\lambda = 1$, as the performance deteriorates with increasing step size. Based on the results from the modified degree formulation, we see that a structure’s survivability strongly depends on the impact of nodes at one degree of separation. As the step size increases, the calculation incorporates the effect of nodes farther away, which introduces inaccuracies. The graph formulated for wildfire events exhibits high edge density per node; therefore, the distinction between nodal vulnerabilities diminishes as the step size increases. An underlying assumption made for this formulation is that the next node $v^{(h+1)}$ in a random walk $r_{(r)}^{(i)}$ is selected from the neighboring nodes according to a uniform distribution. That is to say, each node in the neighbor set has an equal likelihood of being selected.

#### Combined results

Based on the results of the modified degree and random walk formulations, it is evident that each formulation has its own advantages and limitations, and the two can be considered complementary to some extent. A combination of the two formulations, defined as their weighted average as shown in Eq. (10), is tested, and the results are shown in Fig. 4. Here, $w_{md}$ is the weight factor for the modified degree formulation, and $w_{mrw}$ is the weight factor for the modified random walk formulation. For all analyses, equal weight is given to both formulations, i.e., $w_{md} = w_{mrw} = 0.50$. The prediction results show a maximum accuracy of 58.15% for the Camp Fire and 63.15% for the Glass Fire. For the Camp Fire testbed, the combined formulation (Fig. S14a in the SI) improves prediction accuracy over the degree and random walk formulations (Figs. S8a and S11a in the SI). For the Glass Fire testbed, a slight decrease in accuracy is observed (Fig. S14b in the SI) relative to the random walk formulation (Fig. S11b in the SI).

$$\begin{aligned} V_{h}^{(i)} = w_{md} \cdot V_{md}^{(i)} + w_{mrw} \cdot V_{mrw}^{(i)} \end{aligned} \tag{10}$$