Files
graph_recognition_w_attn/__pycache__/model.cpython-312.pyc

66 lines
7.0 KiB
Plaintext

2025-07-31 01:12:53 -04:00
2025-08-04 12:44:35 -04:00
2025-09-01 14:46:34 -04:00

Binary file (compiled CPython 3.12 bytecode) generated from
/home/letpon/code/graph_recognition_w_attn/model.py
The bytecode itself cannot be displayed as text. The strings below are the ones that can be read from the dump.

Imported names:
  jax, random, jax.numpy (jnp), optax, functools.partial
  ModelConfig, TrainConfig (from config_)

Functions and docstrings:
  init_linear_layer (key, in_features, out_features, use_bias)
    "Initializes the weights and biases in a linear layer."
  init_fn (key, config)
    "Initializes all model parameters. Returns a pytree"
  forward (params, input_timesteps, config)
    "Model's forward function. Takes in the parameters and input timesteps, returns predictions"
  get_attention_fn (params, config)
    "Calculates and returns the learned attention matrix between agents.
    This is a pure function for analysis."
  train_model (config, inputs, targets, true_graph, train_config)
    nested functions: loss_fn, update_step

Parameter keys:
  agent_embeddings (weight), translate, attn_proj, head; linear layers use W and b

Log keys returned by train_model:
  loss_history, graphs, true_graph