
Dependencies are wrong #19

Open
MrGranddy opened this issue Jul 18, 2023 · 3 comments

Comments

@MrGranddy

Hello, I have tried many different version combinations to make the LLaMA script work, but it produces very poor results, which is also what I observed with my own implementation and another SparseGPT LLaMA implementation.

All three of these implementations produce exactly the same results, which is reassuring because it suggests we are doing everything correctly, but the pruned LLaMA still performs remarkably poorly, even worse than BLOOM or OPT.

If your results are better, could you please share the exact dependencies needed to repeat your experiments? The transformers version given in the README does not even include the LLaMA tokenizer.

Thank you

@efrantar
Member

Hi, what do you mean by "very bad results"? As also discussed in #7, pruning LLaMA seems to be more challenging than pruning e.g. OPT, possibly because it is more parameter-efficient. I just ran --sparsity .5 on the 7B model with fairly recent package versions (transformers==4.31.0, datasets==2.13.1 and torch==2.0.1) and got 7.20 PPL on WikiText and 9.29 PPL on C4 (some package version newer than the ones we list in the README seems to have broken the PTB numbers in general, not sure why). What numbers do you get?
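For reference, the perplexity check I mean looks roughly like the sketch below. This is not the repo's exact eval code, just a minimal approximation of it: non-overlapping 2048-token windows over the WikiText-2 test split; the model path is a placeholder for whatever (pruned) checkpoint you are evaluating.

```python
# Minimal perplexity sketch (not the repo's exact code); model path is a placeholder.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "huggyllama/llama-7b"  # placeholder: point at your pruned checkpoint
tok = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).cuda().eval()

# Concatenate the WikiText-2 test set and split it into 2048-token windows
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
ids = tok("\n\n".join(test["text"]), return_tensors="pt").input_ids

seqlen, losses = 2048, []
for i in range(ids.shape[1] // seqlen):
    batch = ids[:, i * seqlen:(i + 1) * seqlen].cuda()
    with torch.no_grad():
        # labels=batch makes the model return the mean token NLL for this window
        losses.append(model(batch, labels=batch).loss.float())

print(f"WikiText-2 PPL: {torch.exp(torch.stack(losses).mean()).item():.2f}")
```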

@MrGranddy
Author

Hello, I ran evaluations on some standard LLM evaluation tasks using the LM Evaluation Harness:
https://github.com/EleutherAI/lm-evaluation-harness
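
Roughly, the zero-shot evaluation I ran looks like the sketch below; the exact function names and model-type strings depend on the harness version, and the checkpoint path is a placeholder.

```python
# Rough sketch of the zero-shot evaluation; API details vary across harness versions.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",                                  # Hugging Face causal-LM backend
    model_args="pretrained=/path/to/pruned-llama-7b",   # placeholder checkpoint path
    tasks=["arc_challenge", "arc_easy", "boolq"],
    batch_size=8,
)
print(results["results"])  # per-task acc / acc_norm numbers
```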

I get the following results for LLaMA:

| LLaMA-7B | Dense | Magnitude 50% | SparseGPT 50% | SparseGPT 2:4 |
|---|---|---|---|---|
| arc_challenge (acc_norm) | 0.4138 | 0.302 | 0.2833 | 0.291 |
| arc_easy (acc_norm) | 0.5248 | 0.2702 | 0.2588 | 0.266 |
| boolq (acc) | 0.7315 | 0.6214 | 0.6193 | 0.3823 |

Normally I would expect some performance drop, but for comparison, here are the results for BLOOM-7B1:

| BLOOM-7B1 | Dense | Magnitude 50% | SparseGPT 50% | SparseGPT 2:4 |
|---|---|---|---|---|
| arc_challenge (acc_norm) | 0.3336 | 0.3072 | 0.3055 | 0.2722 |
| arc_easy (acc_norm) | 0.5728 | 0.5261 | 0.5316 | 0.4945 |
| boolq (acc) | 0.6291 | 0.6064 | 0.6303 | 0.6226 |

So there is probably something wrong with the implementation. As I mentioned, my own implementation also gets the same results, so I would like to compare against yours. Could you please run the experiments with the latest version of transformers so we can validate?

@MrGranddy
Author

Sorry, I closed the issue by accident; I would be glad if you could re-open it so we can resolve this. I also tried the experiment with multiple torch, Python, and transformers versions. If your results are better, I would expect that it only works with a very specific combination of library versions for some reason.

MrGranddy reopened this Jul 24, 2023